Editorial

Peggy Johnson 

Assembling and bringing an issue of Library Resources and 
Technical Services (LRTS) to publication and your mailbox is 
a lengthy process. This issue (v. 52, no. 2) has been in progress 
for several months. I am preparing to send the papers for this 
issue to our publisher, ALA Production Services, at the end 
of December. The three-month production process means 
that this issue should reach you in April. During these three 
months, I review the copyedited manuscripts once and the 
page proofs twice. Authors also receive the page proofs for their papers and have 
a short window in which they can send corrections to me to incorporate. 

Authors often ask when their paper will be published. The simple answer is 
that an author could add three to six months to the date the paper is accepted, 
and this will vary depending on when a paper is accepted and the publication 
schedule. For example, I could accept a paper December 30, the day after I 
mail the content for the April issue. That paper would not appear until the July 
issue — a nearly six-month lag. 

Another step on the path to publication is the paper review. All papers 
submitted to LRTS go through a double-blind review process, which means 
the reviewers do not know the name of the author or authors, and authors do 
not know the names of the individuals (usually two) who review their papers. 
Reviewers evaluate the paper according to several criteria. First is whether the 
paper is relevant to the aims and scope of LRTS. The purpose of LRTS is to 
support the theoretical, intellectual, practical, and scholarly aspects of the pro- 
fession of collection management and development, acquisitions, cataloging and 
classification, preservation and reformatting, and continuing resources. In addi- 
tion, reviewers consider documentation of sources and background information 
(appropriate sources should be referenced, and they should be cited according 
to the LRTS format), research methods employed (if this is a research paper), 
analysis of findings or results, and presentation (clarity, format, style, etc.). If a 
paper reports a research project or a practical solution to a problem, the review- 
ers look for a clear statement of the problem and clarity with which the findings 
or results are reported. LRTS publishes papers that report on library-specific 
initiatives, but these must be broadly applicable and of interest and value to other 
librarians and libraries. 

After the reviewers send their assessment of the paper to me, I compile 
their comments into a letter that is sent to the author, often accompanied by the 
edited manuscript. Reviewers can make one of four recommendations: accept, 
accept pending revision, reject, or reject while encouraging the author to make 
substantial revisions and resubmit for a second double-blind review. Our goal is 
to get a response to the author within six to eight weeks. This allows time for the 
reviewers to complete their work and for me to transcribe comments onto the 
manuscript and write a letter to the author or authors. This letter is often lengthy 
because our goal is to give authors everything they need to prepare the best
possible paper, one that meets the high standards of LRTS and conforms to LRTS
style and format requirements.

Nearly every paper requires some revision, and the time this takes depends 
on the authors. Authors who turn around their papers quickly reduce the time 
to publication. If a paper goes through the double-blind
review process again, this adds another six to eight weeks 
to the process. The LRTS editorial board voted in July 2008 
to reduce the length of time in which their reviews must be 
completed, and I am working to reduce the time I spend 
assembling the response to the author. If authors carefully 
read and follow the instructions for authors available on the 
LRTS website (www.ala.org/ala/mgrps/divs/alcts/resources/lrts),
examine recent issues, consult the Chicago Manual
of Style (15th ed.), and check for typographical and gram- 
matical errors, they can save themselves and me effort and 
time in the review and revision process. One improvement 
already implemented is the ability for authors to use the 
automated endnote feature in word processing software.
This makes managing references much easier.

I am hoping that the new online manuscript submis- 
sion and peer review system will be in place by the time 
you read this. Until that happens, the system I use to man- 
age submissions and the review process is paper-based and 
manual, excluding receiving and sending papers as e-mail 
attachments. This means I track submissions, reviewers and 
reviews, decisions, and assembly of issues through various 
unlinked files — a tedious and complex process. A link to the 
new system, Editorial Manager from Aries, will be found 
on the LRTS website. We expect this system to simplify and 
expedite the process of bringing authors' papers to publica- 
tion. Please check the site and begin using the online system 
as soon as it is available. 
"Wholly Visionary"

The American Library Association, 
the Library of Congress, and the 
Card Distribution Program 

By Martha M. Yee 



This paper offers a historical review of the events and institutional influences in 
the nineteenth century that led to the development of the Library of Congress 
(LC) card distribution program as the American version of a national bibliog- 
raphy at the beginning of the twentieth century. It includes a discussion of the 
standardizing effect the card distribution program had on the cataloging rules and 
practices of American libraries. It concludes with the author's thoughts about how 
this history might be placed in the context of the present reexamination of the LC's 
role as primary cataloging agency for the nation's libraries. 

On October 28, 1901, the Library of Congress (LC) began to distribute its 
cataloging to the libraries of the United States in the form of cards. Herbert 
Putnam, in his 1901 Annual Report of the Librarian of Congress, called the card 
distribution program "the most significant of our undertakings of this first year of 
the new century." 1 By 1909 these cards were being prepared according to interna- 
tional standard cataloging rules agreed upon by the American Library Association 
(ALA) and the British Library Association. 2 Once these rules were adopted by 
other libraries, a cooperative approach to the national bibliography became possi- 
ble. In this new cooperative approach, cataloging done at many different libraries 
could be distributed through the LC cards and made part of the national biblio- 
graphic structure. This ingenious scheme, by which a shared cataloging program 
to lower cataloging costs produced the equivalent of a national bibliography at the 
same time, has become the envy of the rest of the world. This approach is now 
very much taken for granted in the United States, but it could not have happened 
without the conjunction of a number of economic, political, and social factors at 
the turn of the century, without the intervention of several visionary men (among 
them Melvil Dewey, Herbert Putnam, and J. C. M. Hanson), without the actions 
of the ALA and the LC as institutions, and without the inaction of the publishing 
industry. This paper explores how this conjunction of factors came about, and 
then speculates about implications for the current environment of shared catalog- 
ing and the role of the LC therein. 



A Visionary Plan 

The idea had been in the air for half a century or more. The LC's Annual Report 
for 1902 includes a "Bibliography of cooperative cataloguing . . . (1850-1902)," 
which cites articles on this subject from all over the world. In 1852, Charles C. 
Jewett proposed his famous stereotyping plan, by which the Smithsonian would 
collect cataloging from U.S. libraries and store it in the form
of stereotyped plates, which would be made accessible to 
any requesting library. The plan failed for technical reasons, 
and because Joseph Henry, Secretary of the Smithsonian 
and Jewett's boss, did not agree that this would be part of 
the proper function of the Smithsonian. 3 

In 1876, the ALA was founded. According to Putnam, 
a "main purpose" in its founding was "a centralization of 
cataloguing work, with a corresponding centralization of 
bibliographic apparatus." 4 At the first meeting of the ALA 
in 1876, Melvil Dewey, instrumental in the ALA's founding,
proposed that "the preparation of printed titles for the com- 
mon use of libraries" be discussed, stating, "There somehow 
seems to be an idea among certain leaders of our craft that 
such a thing is wholly visionary, at least, their failure to take 
any practical steps in the matter would seem to indicate 
such a belief. Now, I believe, after giving this question con- 
siderable attention, that it is perfectly practicable." 5 

Over the next twenty-five years, the ALA tried a num- 
ber of different ways to put this "visionary" scheme into 
effect. Attempts to induce publishers to furnish cataloging 
for their new books failed to gain the support of librarians 
and publishers for a number of reasons detailed by Scott and 
Ranz. 6 Among them are the following: 

1. Not all publishers cooperated; many were unwilling to 
supply free advance copies of publications for catalog- 
ing. This delayed receipt of cards. 

2. Preparation of quality cataloging would have delayed 
listings that the book trade needed promptly. 

3. Objectives for entries for commercial purposes were 
bound to differ from the objectives for entries for 
library purposes (e.g., there were differences of opin- 
ion over what was acceptable content for annotations). 

4. Publishers were reluctant to support what was per- 
ceived of as another commercial enterprise. 

5. Schemes required that libraries subscribe to all or none 
of the cataloging. 

6. The number of titles covered was too limited for the 
larger libraries, but too large for the smaller libraries 
to justify the expense. 

7. Card sizes in libraries had not yet been standardized. 

8. Librarians were undoubtedly uncertain about the 
permanence of the schemes, any one of which would 
have required "basic and far-reaching changes in their 
normal cataloguing practices." 7

9. Undoubtedly the major factor was the fact that catalog- 
ing rules had not yet been standardized. 

The second approach tried by the ALA, after various 
attempts to enlist the publishers failed, was to try to set up 
a central cataloging bureau under the auspices of the ALA 
itself. This was established at the Boston Athenaeum in 1896
and operated until the LC began distributing cards in 1901.
The number of subscribers was never high, largely because 
the all-or-none subscription practice mentioned above was 
maintained. 8 Undoubtedly, lack of standardization also con- 
tinued to be a major factor. 

In 1877, a year after the founding of the ALA, a third 
possibility for the solution to this problem was already being 
suggested by Melvil Dewey: "Is it practicable," he asked, 
"for the Library of Congress to catalogue for the whole 
country?" 9 In the next paragraph, he points out that the first 
step in the solution of the problem will be the development 
of standard cataloging rules. In making these two sugges- 
tions, Dewey outlined the two major ways in which the 
ALA would contribute to the development of the American 
approach to a national bibliography. 

Cataloging Rules and Standards 

Heisey and Henderson describe the many codes being 
followed by American libraries in 1900, when it became 
apparent that the vision of centralized cataloging of which 
librarians had been dreaming might be realized by the LC. 10 
The ALA had approved a code of rules in 1883, but "they 
were not detailed enough to provide a universal American 
standard for cataloging," and they simply became one 
among many codes in use in the country. 11 This might be 
compared to the situation today in which those seeking to 
control electronic resources use various metadata schemes. 
The three leading codes in use were Cutter's, Dewey's, and 
Linderfelt's. 12 Heisey observed that "it was the practice, as 
well as the preference of most cataloguers to use several 
codes, taking what was most advantageous from each." 13 In 
December 1900, the ALA publishing board appointed the 
Advisory Committee on Cataloging Rules, chaired by J. C. 
M. Hanson, head of the cataloging department at the LC, 
and charged the committee with recommending typography 
and format for the new cards and suggesting changes in the 
ALA rules to make them suitable for use in the new centralized
cataloging project. 14 The LC had already adopted
cataloging rules in May 1898; these rules were based on
Cutter's rules. 15 Cutter was one of the members of the ALA
committee — thus, as Dunkin pointed out, the new code, 
published in 1908, "owed much to Cutter." 16 However, there 
was a significant difference. Cutter's statement of "Objects" 
and "Means" had disappeared, as had his discussions of the 
rationale behind individual rules. According to Dunkin, 
"The new code was a set of rules without reasons." 17

The rules were not published until 1908, largely because
of the arrival in 1904 of a request from the Catalogue Rules 
Committee of the British Library Association that the ALA 
consider making the new code a joint Anglo-American code. 
Exchanges by correspondence delayed the publication of 
the code by several years, but, when published, it represented
agreement by the Americans and the British on all
but 8 of 174 rules. 18 

The new rules were also designed to take into account 
the practices of the Library of Congress, which, after all, 
was a large research library. The committee had decided 
soon after its formation that the plan for the code should 
be "carried out for the large library of scholarly character, 
since the small libraries would only gain by full entries, 
while the large libraries must lose if bibliographical fullness 
is not given." 19 Dan Lacy questioned the rationale that full 
entries are needed for large scholarly libraries, which may 
originally have been Hanson's. Hanson was serving as chair 
of the committee when it made this decision. Lacy felt that 
the full cataloging called for in the 1908 rules was the result 
of the ideals of the library movement then burgeoning in 
the United States: 

Cataloging of an elaborate character suited the 
economy of the American public or college library 
of the day, straining to make its necessarily limited 
collection most readily available and most realisti- 
cally useful to its many readers. But if it suited 
the economy of the libraries, it no less matched 
the aspirations of their librarians, in whom were 
joined an austere zeal in scholarship not unlike that 
of Browning's grammarian and an enthusiasm for 
public service that placed the reader's convenience 
far ahead of the cataloger's toil. These aspirations 
were wholly shared by Hanson and his colleagues; 
there is no evidence that they ever questioned 
whether the Library of Congress might have a 
different role, whether it might be called upon to 
acquire and preserve volumes of material whose 
infrequent use made unnecessary, and whose mass 
made impossible, the kind of cataloging suitable for 
a select and actively used collection. 20 

As indicated above, from the beginning the potential 
was present for a clash of objectives at the LC between the 
need to create cataloging suitable to a large research library 
and the desire to produce cataloging useful in other quite 
different libraries in the country. Various reviewers of the 
1908 and subsequent Anglo-American codes never fail to 
note where the LC had forced a decision favorable to it and 
possibly detrimental to public service in other libraries in 
the country, so it is interesting to contrast their reactions 
with the following, somewhat plaintive account by Hanson 
in his 1907 annual report: 

The Library of Congress, mainly on account of the 
distribution of its catalogue cards to other libraries, 
had been obliged to make a number of concessions
in order to bring its own rules into approximate
agreement with those of the American Library 
Association. No doubt these concessions have 
served to retard its own work and have at times 
been the cause of some confusion in its records. 
On the other hand, the fact that the rules now 
governing its catalogues have been accepted by the 
two associations which include the great major- 
ity of libraries in the United Kingdom and in the 
United States represents in itself a great advance 
in cooperation and uniformity of methods, and will 
have an influence in its future relations to libraries 
and students, at home and abroad, the importance 
of which can hardly be overestimated. It is felt, 
therefore, that the Library has been fully justified 
in its policy of making liberal changes in rules and 
practice whenever such changes served to further 
a general agreement. 21 

The LC adopted the ALA's List of Subject Headings for 
Use in Dictionary Catalogs, which had been published in 
1895. 22 Prior to 1895, many libraries did not have subject 
catalogs, relying on shelf classification (and reference librar- 
ians) to provide subject access to their collections. 23 One of 
the reasons for the success of the card distribution program 
may have been that it allowed libraries without subject cata- 
logs to build them quickly and cheaply and thus provide an 
added public service. One might posit that the fact that the 
Library of Congress Subject Headings (LCSH) is now such a 
deeply entrenched standard in this country is because of the 
card distribution program that brought its subject descrip- 
tors into so many libraries. 

The LC's decision to develop a new classification sys- 
tem, rather than using the Dewey Decimal Classification 
(DDC), which was then in widespread use, was perhaps the 
most clear-cut instance in which the LC decided to place a 
higher priority on its own needs as a large research library 
over the needs of other libraries in the country. 24 Although 
Young had authorized the creation of a new classification 
scheme, Putnam was very aware of the service the LC could 
provide other libraries were he to reverse Young's decision 
and switch to the DDC. Miksa states that 

the chief difficulty in the consideration was the 
necessity that any scheme adopted be shaped 
to the particular needs of the collections of the 
Library itself. If the Dewey Decimal Classification 
were to be used, many changes would be required 
in it. But Dewey was unwilling to allow any signifi- 
cant change. He believed that making alterations 
would be unfair to those libraries already using his 
system. Thus he required that it be adopted with 
only minor changes. 25 
To Putnam's disappointment, he had to abandon the
idea of using the DDC at the LC. 

The Library of Congress 

The Librarian of Congress in 1876 was Ainsworth Rand Spofford. Cole wrote
that 

for the most part, Spofford operated quite inde- 
pendently from the American library movement 
and the American Library Association itself. The 
primary reason was, quite simply, that he did not 
have the time to participate. . . . Spofford's inde- 
pendence from other libraries and librarians was 
accentuated by his idea of a national library as 
well as by his personal temperament. He believed 
the Library should be, essentially, a comprehen- 
sive accumulation of the nation's literature, the 
American equivalent of the British Museum and 
the other great national libraries of Europe. He did 
not view it as a focal point for cooperative library 
activities and was not inclined to leadership in that 
direction. Furthermore, his personal enthusiasms 
were acquisitions and bibliography. 26 

Spofford's contribution to the eventual success of the 
card distribution program should not be overlooked. He was 
the person responsible for gaining congressional approval 
for a massive expansion of the collections of the LC, most 
notably through the copyright amendment of 1865 and the 
copyright law of 1870, which required copyright deposit 
at the LC. The card distribution program would not have 
succeeded if it had not been based on the comprehensive 
and continuously increasing collections at the LC. However, 
"Spofford's administration between 1872 and 1897 was 
dominated by the unceasing flow of materials into cramped 
quarters." 27 Because of this and because his staff was so
limited (in 1897 it consisted of forty-two employees, twenty- 
six of whom worked full time on copyright), the ALA was 
discouraged from looking to the LC for distribution of cata- 
loging in 1876. 28 

Besides copyright deposit, Spofford's second major 
contribution was a new building for the LC. In 1896, the 
Joint Committee on the Library of Congress held hearings 
concerning the condition of the LC on the eve of its move 
into its new building. Cole described the way the ALA, led 
by Dewey and R. R. Bowker, took this opportunity to "exert 
its influence in the reorganization that obviously would take 
place once that spacious, modern structure was occupied." 29 
Bowker persuaded the Joint Committee to invite the ALA 
to send witnesses to testify at the hearings. Among these 
witnesses were Dewey and Putnam. Cole stated that "both 
men carefully avoided direct criticism of Spofford, but
nonetheless their view of the proper functions of a national
library clearly differed from that of the aging Librarian 
of Congress." 30 Among the functions of a national library 
detailed for Congress by Dewey and Putnam were central- 
ized cataloging, interlibrary loan, a national reference and 
bibliographic center, and a national union catalog. 

Spofford never had the money or the staff to catalog the 
LC collection adequately. After going through a succession 
of book catalogs, the last one of which remained incomplete 
at the letter c, an author-title card catalog, not accessible to 
the public, was begun. 31 The real guide to the collection, 
however, was Spofford himself, who was known for his 
phenomenal memory and extraordinary knowledge. 32 At the 
conclusion of the hearings in 1896, Putnam recommended 
that "an endeavor should now be made to introduce in the 
Library the mechanical aids which will render the Library 
more independent of the physical limitations of any one man 
or set of men; in other words, that the time has come when 
Mr. Spofford's amazing knowledge of the Library shall be 
embodied in some form which shall be capable of rendering 
a service which Mr. Spofford as one man and mortal can not 
be expected to render." 33 The era of the librarian who could 
know every book in the library had come to an end; it was 
time to supplement the librarian with the "machine," in this 
case, the public card catalog. 

As a result of these hearings, the LC was reorganized 
and expanded, and the office of Librarian of Congress 
"gained the unique powers that exist to this day. . . . The 
Librarian was given sole authority and responsibility for 
making the 'rules and regulations' for governing the 
Library." 34 Spofford was replaced by John Russell Young, a 
journalist who was a friend of President McKinley. The fact 
that he was not a librarian was a setback for the profession, 
but, despite poor health that resulted in his death in 1899, 
Young made some important decisions. He was responsible 
for the appointment of two key people in the development 
of the cataloging program at the LC, J. C. M. Hanson and 
Charles Martel. At the advice of Hanson and Martel, Young 
made the decision to develop a new classification scheme 
for the LC, the LC Classification, rather than using the 
DDC already in use by American libraries. For better or 
for worse, this decision was to have a far-reaching effect on 
American library practice. 

When Young died, it was again necessary to appoint a new 
Librarian of Congress. This time the ALA took a hand in the 
appointment. A number of writers have detailed the compli- 
cations that ensued. 35 The ALA got its way, and Putnam was 
appointed. Putnam's testimony before Congress as an ALA 
spokesman has been quoted above. He had already served as 
president of the ALA in 1897-98 and would again in 1903-4.
There is no question that Putnam was the ALA's man. Putnam 
immediately set about creating a national library according to 
the ALA's definition of a national library: a definition that 
dealt not just with the collections, but with service based on
the collections. The newly defined powers of the Librarian of 
Congress allowed him to create this de facto national library 
somewhat independently of Congress. On paper, the LC was 
still the Library of Congress, not the library of the nation as 
a whole, but by instituting services such as the card distribu- 
tion program, Putnam committed the LC to actions that 
defined it as a national library in fact (de facto), even if this 
was not recognized by law (de jure). However, it must also
be recognized that Congress, by appropriating the money 
to hire the staff necessary to institute centralized cataloging, 
by passing the legislation that authorized card distribution, 
and by approving Putnam's appointment in the first place, 
tacitly approved. It should also be noted that appropriations 
suitable to meeting the national obligations of the LC (or, 
to put it another way, disproportionate to the narrow role of 
Congress's library) have been made by successive Congresses 
ever since. Cole noted that, at the time, "the political climate 
was right and the country was in an expansionist mood," and 
these must have been factors in Congress's tacit approval. 36 
Certainly, Putnam was able to obtain a tremendous increase 
in the direct appropriation for the LC and its staff. According 
to the Report of the Librarian of Congress for the Fiscal Year 
Ending June 30, 1900, the appropriations for the LC went 
from $291,625 to $513,553 between 1899 and 1901. 37 As a 
result, the staff in the Catalogue Division increased from 
fifteen in 1898 to ninety-one by 1902. 38 

In looking at a highly successful program in retrospect, 
one can easily forget the courage it took in the beginning 
to commit always scarce resources when success was by no 
means assured. When Putnam took over the LC, he took 
over the same state of disarray that had prevented Spofford 
from volunteering the services of the LC to the libraries 
of the nation. An immense recataloging program had just 
begun and this, plus the cataloging of the titles that had 
never been cataloged, would take years to complete. The 
LC had just begun to use a new classification scheme in 
1895 to catalog subjects using a list of subject headings (the 
List of Subject Headings for Dictionary Catalogs, first pub- 
lished in 1895, as adapted by the LC), and to plan for the 
use of new descriptive cataloging rules. 39 

As described above, a number of previous centralized 
and cooperative cataloging schemes had failed over the 
previous twenty-five years. Is it an illusion, or is a note of
doubt present in Putnam's voice in the following quotation 
from 1901? 

A general distribution of the printed cards: That has 
been suggested. ... It may not be feasible: that is, it 
might not result in the economy which it suggests. 
It assumes a large number of books to be acquired, 
in the same editions, by many libraries, at the same 
time. In fact, the enthusiasm for the proposal at
the Montreal meeting last year has resulted in but
sixty subscriptions to the actual project. It may not 
be feasible. But if such a scheme can be operated 
at all, it may perhaps be operated most effectively 
through the library which for its own purposes is 
cataloging and printing a card for every book cur- 
rently copyrighted in the United States. 40 

The fact that Putnam proceeded with the card distri- 
bution program despite an initial "disheartening" response 
from the library community led Archibald MacLeish to 
describe his action as "notable for its courage." 41 

Bowker guaranteed the LC $1,000 to cover any deficit 
it might incur in the first year of the program. 42 The cards 
were to be sold at cost, plus 10 percent. The 10 percent was 
added to the legislation that authorized the card distribution 
by the public printer, F. W. Palmer. 43 Putnam's justifica- 
tion for this and other programs carried out by the LC in 
its capacity as "national library" is interesting today in the 
context of controversies over public sector versus private 
sector activity in the information field. Putnam wrote, "The 
national library for the United States should limit itself to 
the undertakings which cannot, or cannot efficiently, or can- 
not without extravagance be carried on by the several states 
or smaller political sub-divisions; or (since libraries are a 
frequent and common form of private benefaction) are not 
adequately cared for by private endowment." 44 

From the beginning, the centralized cataloging done 
at the LC and distributed to the libraries of the nation 
included cooperatively produced records. At first, other 
government libraries were asked to contribute catalog copy, 
which was edited at the LC and distributed in the form of 
printed cards. The first was the Department of Agriculture 
Library in 1902, and others followed. In 1910, libraries that 
had been designated as depository libraries and received 
a complete set of LC cards — to distribute access to the 
national bibliography throughout the country — were asked 
to supply catalog copy for books not in the LC's collections. 45
Although these cooperatively produced records
never constituted a large proportion of the distributed cards, 
they did set a precedent for such present-day projects as 
the Program for Cooperative Cataloging's Monographic
Bibliographic Record Program (BIBCO, www.loc.gov/catdir/pcc/bibco/bibco.html),
Name Authority Cooperative Program (NACO, www.loc.gov/catdir/pcc/naco/naco.html),
and Subject Authority Cooperative Program (SACO,
www.loc.gov/catdir/pcc/saco/saco.html), projects in which
catalogers outside the LC contribute significantly greater 
numbers of catalog records, name authority records, and 
subject authority records to the national bibliography. 
Another related cooperative effort was the National Union 
Catalog (NUC), which began at the same time as card dis- 
tribution because Putnam asked four large research libraries 
to exchange their own printed cards with the LC. The NUC,
housed at the LC, was thus even more complete than the 
depository sets of cards distributed throughout the country, 
and was used as a point of last resort for interlibrary loan and 
to supply cataloging to libraries whose requests could not be 
satisfied with cards from the LC stock. 46 



Success— and Why 

By 1905, even before the standard cataloging rules had been 
published, Putnam was able to report considerable success 
in the cataloging distribution program: 

The sale of these cards to other libraries began, 
you will recall, three and one-half years ago. We 
have not sought to press it for three reasons: (1) 
Because the distribution involves to the Library of 
Congress an expense and some inconvenience not 
at all reimbursed by the subscriptions received; 
and (2) because the cards at present cover but a 
fraction of the existing collection, and (3) because 
our methods and rules of entry are still undergoing 
revision, and we did not covet the task of explaining 
changes or of satisfying subscribers as to inconsis- 
tencies. We have not, therefore, sought to push the 
sales. They have, however, increased each year in 
almost geometric proportion. 47 

Scott detailed several reasons for this success. First, 
card catalogs were replacing book catalogs at this period, 
and the card distribution program came along at just the 
right time to hasten the transition. Second, the LC was able 
to set up a permanent card distribution staff, which enabled 
them to allow librarians to order just the cards they wanted 
rather than require them to subscribe to all the cards as ear- 
lier schemes had. Edlund describes in detail how elaborate 
the card distribution service was eventually to become. 48 
Third, as Scott puts it, "the entries were legitimatized both 
as emanating from the national library and as conforming to 
current cataloging practice." 49 One suspects a chicken-and- 
egg situation here in which the standard cataloging practice, 
which the ALA had been so active in establishing, legitimized 
the cards, and the cards, when widely adopted, ensured that 
the national standard was a widely used standard — and thus 
a more powerful one. Hanson suggested a fourth reason 
for the popularity of the cards, already alluded to above. 
As head of cataloging at the LC, he received many letters 
concerning cataloging, and from these he was "tempted to 
conclude that a large proportion of the subscribers have 
been led to adopt the printed cards because they value the 
suggestions in regard to subjects." 50 

Edlund suggested several other factors that may have
contributed to the success of the program. 51 In 1904, the 
LC agreed to publish on the ALA's behalf a new edition
of the A.L.A. Catalog. 52 This was one of Dewey's pet proj- 
ects and consisted of cataloging for eight thousand "best 
books" recommended by the ALA for a small library. The 
1904 edition contained LC card numbers for all eight 
thousand volumes and, in conjunction with the publication 
of the catalog, the LC offered to sell cards for the entire 
set for one lump sum. 53 The A.L.A. Catalog and the LC 
cards appeared on the scene in the midst of the Andrew 
Carnegie period of American libraries. Edlund points 
out that between the years 1890 and 1917, the Carnegie 
Foundation gave more than $41 million for the construc- 
tion of twenty-five hundred libraries in small towns all 
over the country. He observes, "Often they were part-time 
libraries, run by part-time personnel, frequently with only 
a part-time knowledge of the principles and practices of 
operating a library. To some of these people, 'catalog' and 
'cataloging' were not exactly household words, so they were
prime candidates for whatever assistance the Library of 
Congress could provide." 54 Given these circumstances, it 
is hardly surprising that "the response to the publication of 
the catalog was of landslide proportions." 55 

Last, but not least, a major factor in the success of the 
card distribution program was the comprehensive scope 
of the collections of the LC, which were continually being 
increased by copyright deposit. In Putnam's words, "A 
collection universal in scope will afford opportunity for 
bibliographic work not equalled elsewhere." 56 Instead of
being restricted to current publications of U.S. publishers, 
as libraries would have been if card distribution through the 
publishers had been a success, cards were available for all 
additions to the LC and, as time passed, for all previously 
cataloged books. In addition, the beginning of the card dis- 
tribution program coincided with a massive recataloging 
effort at the LC as the old card catalog was converted to the 
new printed cards and as the collection was classified using 
the new classification scheme. 

Putnam himself identified what must have presented 
something of a paradox to those in charge of collection devel- 
opment at the LC when he stated, "To supplement other 
collections for research your national library must have the 
unusual book; to enable its cataloging work to be serviceable 
to other libraries of varying types, it must have the usual 
book." 57 In other words, the LC should collect everything!
Putnam even went so far as to suggest that "it would pay this 
great community, through its central government, to buy a 
book for the mere purpose of cataloging it and making the 
catalog entry available in these printed cards, even if the 
book should then be thrown away." 58 In fact, the LC was
never able to implement such a collection policy. Charles 
Harris Hastings, head of the Card Division for thirty-seven 
years, apparently tried to push for something similar: 
The Card Division also desired to have non-copyrighted
books purchased on the strength of
orders received for cards, instead of waiting for 
them to be ordered for the reading-room ser- 
vice, or on the recommendations of the chiefs of 
divisions. The Chief of the Accessions Division, 
Superintendant of the Reading Room, and other 
officials maintained that the Library would be 
flooded with popular books and suffer serious 
financial loss if the change was made. 59 

In 1902, Hastings announced to the ALA Annual 
Conference that "the fact is recognized by those having to 
do with the ordering of books at the Library of Congress 
that it, being primarily a reference library, can never hope to 
buy and never ought to buy many books which may properly 
be bought by public libraries." 60 

Concluding Thoughts 

In 1876, the United States was, according to Frederick 
Leypoldt (editor of Publishers' Weekly), "almost the only
civilized country . . . not represented by a national bibli- 
ography." 61 Speaking at the Waukesha Conference of the 
American Library Association in 1901, Dewey said, 

You remember that when the Pacific railroad was 
built, and as the ends came together to make the 
connection, a great celebration was held through 
the country, a thrill that the work was at last done; 
and I feel today, now that we hear in this able 
report that printed catalog cards are really to be 
undertaken at the National Library, that what we 
have waited for over 20 years, and what we have 
been dreaming about has come to pass at last. 62 

The solution to the problem of creating a national 
bibliography seems peculiarly American, and Dewey's com- 
parison with the mechanical and technological triumph 
in Ogden, Utah, singularly appropriate. Putnam, too, saw 
the triumph as being mechanical in nature. He wrote, 
"American instinct and habit revolt against multiplication of 
brain effort and outlay where a multiplication of results can 
be achieved by machinery." 63 In a sense, the LC cards were 
interchangeable parts for libraries. Standardization made it 
possible for the smallest library in the country to have the 
same quality of cataloging as the largest research library. In 
this, the card distribution program was profoundly demo- 
cratic. Every American citizen who used a public library 
could benefit from the expertise that went into creating the 
national bibliography at the LC. 

Every silver lining has a cloud, however. The card
distribution program marked the end of an era when "librar- 
ian" meant a person who both cataloged and administered 
a library, and thus was an incomparable guide for the user 
through his or her library. With cataloging centralized at the 
LC, the fears of librarians such as Frederic Vinton came to 
pass, to some extent: "We fear that the so much desiderated 
object of co-operative cataloguing (by which each librarian 
shall have the least possible writing to do) is unfavorable to 
good librarianship. For myself, I would on no account lose 
that familiarity with the subjects and even the places of my 
books which results from having catalogued and located 
every one." 64 Henderson pointed out that the creation of the 
ALA Advisory Committee on Cataloging Rules to create the 
1908 code led to the separation and isolation of catalogers 
from administrators and stated that "before 1900, cataloging 
was a concern of all of the ALA's members, since the issues
were discussed in general meetings." 65 According to Bishop, 
"Classification and cataloging occupied the major part of the 
curriculum in the early years of training in library science. 
They were definite matters which could be taught, and 
they were controverted subjects which awakened intense 
partisanship." 66 Today cataloging is practiced mainly at the
LC and by a tiny corps of librarians primarily located in 
large research libraries. Most librarians learn little about 
cataloging in graduate school and go on to administer librar- 
ies, teach children to read, and provide reference service to 
the public without bothering to learn how to use their own 
catalogs properly and without bothering to follow catalog- 
ing issues or comment on them. When the LC recently 
considered abandoning the systematic cataloging of trade 
publications to focus on digitizing their backlogs of rare and 
unique materials, few librarians other than catalogers took 
notice. 67 The loss of cataloging expertise on the part of most
librarians resulting from the efficiencies achieved by means 
of a greater division of labor was probably inevitable. The 
change would surely have come about eventually under the 
crush of the information explosion of the twentieth century, 
but there is no harm in lamenting with Cutter the
passing of a "golden age," especially now, when the very 
existence of human intervention for information organiza- 
tion is under constant threat while most of the library pro- 
fession has little understanding of the danger the loss of it 
would pose for their existence as a profession. 68 

Doing the research to write this paper prompted this 
author to ponder the changing cataloging landscape. The 
final section of this paper explores the current scene and 
the possible future of shared cataloging, asking the follow- 
ing questions: Are the same forces operating today as were 
operating at the turn of the last century? Are they operat- 
ing in the same way or in different ways? At the turn of the 
century, the United States was prosperous, powerful, and 
in an expansionist mode. It was the era of the Progressives, 
who argued that the business of government was to advance 
the health and welfare of its people; technologies new at
that time were harnessed to serve these goals. Some might 
argue that, at present, the LC serves a government that is 
dominated by those who wish to shrink all aspects of government
that are not part of the military-industrial complex.
While funding available for the LC's technical services 
remains the same, a change in the internal priorities of the 
LC now directs more of those funds to digitization projects 
and much less to cataloging. Apparently, cataloging is now 
seen as a part-time activity to be done by staff who are 
also responsible for acquisitions tasks, including electronic 
resource license negotiation. 

In addition, the LC is now situated in an information 
universe in which more pervasive technologies have come 
more and more to set their own agendas. The presence of 
Google on the scene seems to be an indication that there 
are businesspeople who think that there might be money 
to be made by competing with libraries in the provision of 
information to the public, and Google's popularity seems to 
indicate that for many ordinary people convenience takes 
priority over precision, recall, and even accuracy when it 
comes to information access. It seems possible that the 
future customers of libraries will no longer be the public at 
large, but only that small elite consisting of people who do 
serious research, and in a democratic society it is hard to 
get funding to support the work of a small elite, even one as 
important as this one to our future progress and prosperity. 

To this author, it appears that the ALA is now domi- 
nated by library administrators with shrinking budgets who 
know very little about the complexities of bibliographic con- 
trol (other than its expense) and who wonder if the fact that 
undergraduates are in love with Google might not provide 
an excuse for libraries to dispense with the information- 
organization part of their budget entirely. 

The publishing industry may still be reluctant to invest 
in the creation of standardized and detailed cataloging (or 
metadata), just as it was in the nineteenth century, judging 
by the fact that Online Information Exchange (ONIX) is still 
not widely implemented and by the fact that descriptions 
in Amazon.com are so rudimentary that it is not possible 
to distinguish one edition from another, or even to find all 
of the editions of a given work if the author's name or title 
varies. 69 The publishing industry and other content provid- 
ers also appear to be actively involved in shrinking the com- 
mons by extending copyright limits and by more jealously 
protecting their intellectual property rights, making it more 
difficult and expensive for libraries, archives, and museums 
to provide communities with online access to their digital 
holdings through cataloging records. The old partnership
between libraries and publishers in all formats (in which
libraries served to popularize published works by making
them available to more people and created more customers
for publishers by encouraging higher literacy rates) may
be breaking down now that publishers have other ways of
reaching potential customers directly. Most publishers are 
essentially for-profit organizations and, as such, probably 
care little about the fact that those who cannot pay their 
high fees will no longer have access. In this context, it is 
interesting to look back at Melvil Dewey's argument that the 
government should promote the interests of libraries over 
those of publishers because libraries deliver more education 
and civilization to the public for less money than would be 
the case if publishers alone were responsible. 70 One suspects
that, in the current era, our government no longer places 
such a high value on educating its citizens that it would 
decrease the profits of publishers in the way it was willing 
to in Dewey's day. 

OCLC has largely replaced the LC's card distribution 
program as the mechanism by which LC cataloging is shared 
with the nation's libraries. If the LC were eventually to aban- 
don the cataloging of trade publications, the question arises 
as to whether the great research libraries and the remaining 
public libraries would follow the LC in abandoning catalog- 
ing. Would OCLC continue to be viable without LC copy? 
And without the LC at the center, would cataloging con- 
tinue to be done in a standard and sharable way? Already, 
many would argue that the cataloging of audiovisual materi- 
als found in OCLC shows less standardization than that of 
monographs largely because of the lack of a supply of LC 
cataloging copy for audiovisual materials. 

This author has written elsewhere of her fear that the 
rise of the Internet may threaten the profession of librarian- 
ship and the value it places on access to the cultural record 
for all — regardless of socioeconomic level — in order to 
ensure an informed citizenry. 71 However, the Internet is
a tool that can be used either foolishly or wisely. It also 
has the potential to allow cooperative cataloging to thrive 
in a much more efficient fashion in the future. Currently, 
cataloging practice is very repetitive. Every time a new 
edition of a work is published, a cataloging record for that 
new edition is created that repeats much of the information 
already found in the cataloging records for all the other 
editions of that work. Newer conceptual models of catalog- 
ing, such as Functional Requirements for Bibliographic 
Records (FRBR), Functional Requirements for Authority 
Data (FRAD), and Functional Requirements for Subject 
Authority Records (FRSAR), as well as the related model 
underlying Resource Description and Access (RDA), are
based on the hope that libraries, archives, and museums 
may be able to raise cooperative cataloging to a new level of 
efficiency by using the possibly emerging Semantic Web to 
share in the creation of entity records (or the record equiva- 
lent in the Semantic Web, the uniform resource identifier 
or URI) for works, authors, subjects, places, and the like. 72 
It is even possible that we could share the work of entity 
description with people who are not librarians, catalogers, 
archivists, or museum scientists, such as subject experts,
bibliographers, and the like. While we would still need to 
ensure that only people willing to learn how to practice 
accurate entity identification and how to choose commonly 
known names for entities as preferred forms should be 
allowed to have editing privileges, we could collect sugges- 
tions for variant forms not yet linked to preferred forms or 
for corrections to our entity definitions from anyone who 
took an interest, and we could encourage everyone in the 
world to link to our entity definitions when citing an author, 
work, subject, or class; it should be a lot easier for a nonli- 
brarian to link to the appropriate URI than to have to use 
the correct string of text, as is currently the case. If these 
entity records performed the same searching function as 
our authority records currently do, allowing a user to search 
for a particular entity using any extant variant of the name 
of that entity in any language, these more efficiently created 
catalogs could also perform better than ever before. 
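
The searching function described above, in which any variant of a name leads to a single entity, can be pictured as a lookup table from variant forms to one identifier. The Python sketch below is only an illustration under stated assumptions: the class, its method names, and the normalization rule are hypothetical and do not represent any existing authority service or Semantic Web tool.

```python
from typing import Dict, Iterable, Optional


class EntityIndex:
    """Toy index mapping every recorded variant of a name to one entity URI."""

    def __init__(self) -> None:
        self._uri_by_variant: Dict[str, str] = {}
        self._preferred_by_uri: Dict[str, str] = {}

    @staticmethod
    def _key(name: str) -> str:
        # Assumed normalization: case-insensitive, whitespace collapsed.
        return " ".join(name.casefold().split())

    def add_entity(self, uri: str, preferred: str,
                   variants: Iterable[str] = ()) -> None:
        """Register an entity with its preferred form and any variant forms."""
        self._preferred_by_uri[uri] = preferred
        for name in (preferred, *variants):
            self._uri_by_variant[self._key(name)] = uri

    def lookup(self, name: str) -> Optional[str]:
        """Return the entity URI for any recorded variant, or None."""
        return self._uri_by_variant.get(self._key(name))

    def preferred_form(self, uri: str) -> Optional[str]:
        """Return the preferred form of name recorded for a URI, or None."""
        return self._preferred_by_uri.get(uri)
```

In such an arrangement a citation would carry only the URI, and suggested variants could be added to the table without altering the preferred form.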

It remains to be seen whether we will use our new tools 
foolishly, to create a new "dark ages" in which much of the 
cultural record is either lost or hidden from view, or wisely, 
to advance the welfare of humanity and create a world in 
which all of its people, regardless of socioeconomic level, 
enjoy and make use of humanity's entire cultural record. 
What is at issue are the goals we wish to achieve as a soci- 
ety and whether we will direct our current technologies to 
serve those goals or rather abandon those goals in favor of 
allowing the technologies to set their own agendas. The 
economic, political, and social factors that predominate 
in our current society at the turn of the millennium will 
determine our choice in the same way that our choices were 
determined in 1900. As always, it is up to us to choose the 
kind of society we want. 
Series Authority Control
at Oregon State 
University after the 
Library of Congress's 
Series Policy Change 

By Richard E. Sapon-White 

The Library of Congress (LC) decided to suspend creating series authority records 
on May 1, 2006 and to transcribe all future series statements as untraced. To evalu- 
ate the effect on cataloging workload at Oregon State University (OSU) Libraries, 
bibliographic records were examined for untraced series statements from June 
1, 2006 to December 31, 2007. Series titles were then searched in the Library of 
Congress Name Authority File (LCNAF) and corrected to match the authority 
record, if necessary. Series titles not found in the LCNAF were evaluated according 
to current cataloging rules and corrected if necessary. Of the 53,911 records added 
to OSU Libraries' catalog during the study, 977 (2 percent) had an untraced series 
statement. Only 60 (6 percent) of the 977 were records created by the LC after the 
2006 decision. The majority of records (64 percent) with untraced series statements 
were records created by the Government Printing Office. Many untraced series 
were also found in records for materials with publication dates before 2000, most 
resulting from a serials retrospective conversion project. The data suggest that the 
LC's policy change has not created a large cataloging burden and, with relatively 
little effort, OSU Libraries catalogers are able to continue to provide users with 
authorized series title access. 
On April 21, 2006, an announcement from the Library of Congress (LC)
described the decision to suspend the creation of series authority records 
on May 1, 2006. 1 Nearly all series titles were to be transcribed as "untraced" in 
Machine-Readable Cataloging (MARC) field 490 (series statement) with first 
indicator "0." The LC would continue to provide training in the creation of series 
authority records. The LC's reasons for making the decision included cost sav- 
ings and the argument that indexing and keyword searching were adequate to 
provide access to series information. The announcement mentioned no studies 
that evaluated the potential effect of the decision on libraries wishing to maintain 
series authority control. 

In response to the announcement, members of the library community 
expressed concern about the decision, pointing out that they had not been con- 
sulted and were not provided any opportunity for comment. 2 They argued that 
many integrated library systems lacked the ability to search untraced series titles, 
an ability that might otherwise mitigate the effect of the decision on users who 
relied on series title access in catalogs. Some public service librarians stated that 
series titles were an important access point for users, with some users tracking 
specific series of interest. With the loss of series title access, these users would 
experience frustration searching for series of interest to
them. 

Reaction by some in the library community was swift 
and furious. A petition circulated on the Web was signed 
by many notable librarians. 3 The Library of Congress 
Professional Guild, the union representing the LC's work- 
ers, passed a resolution asking the library's administration to 
reconsider the decision. 4 Mann, an LC reference librarian 
and a noted speaker on library research methodology, also 
added his voice to the criticism over the series decision. 5 
The American Library Association (ALA) executive board 
issued a statement commenting that controlled access to 
series information was an important way for library users to 
discover information. The board acknowledged the impor- 
tance of the LC's cataloging, both in quality and quantity, 
and the fact that a lessening of either of these would have 
significant consequences for the finances of American librar- 
ies. 6 The Association for Library Collections and Technical 
Services criticized the decision as well. Only the Program 
for Cooperative Cataloging (PCC) response was less critical, 
asserting that the LC had the right to make such decisions 
independent of the library community. 7 

Ultimately, the LC refused to reverse its decision, but 
did agree to delay implementation of the changes until 
June 1, 2006. 8 OCLC's response was to allow all libraries 
to edit series fields in LC records to provide quality control 
for series headings. 9 Additionally, the untraced series field 
in LC records would not overlay series tracings in OCLC 
records when the LC does copy cataloging. 

To provide the same level of series title control as 
existed prior to the LC's series decision, libraries would 
need to perform the authority work that the LC had been 
doing previously. This would involve checking series state- 
ments against the Library of Congress Name Authority 
File (LCNAF) and revising bibliographic records to match 
the authorized headings. These activities require time and 
skilled personnel. Taking on these additional activities when 
personnel are already busy with their existing duties was a 
great concern in many libraries. How much of an additional 
burden would providing series control be in this new cata- 
loging environment? 

At Oregon State University (OSU) Libraries, an investi- 
gation was begun to assess the effect of the LC's series deci- 
sion on cataloger workload. This study seeks to answer the 
following questions about the bibliographic records OSU 
Libraries downloads to its local catalog: 

• How many untraced series are being added to the 
OSU Libraries catalog? 

• What is the source of cataloging of bibliographic 
records with untraced series? 

• Do series authority records exist for these untraced 
series?

• If a series authority record exists, does the form of the 
series title in the bibliographic record differ from the 
form in the series authority record? 

By answering these questions, the OSU Libraries 
hoped to determine if any adjustments needed to be made 
in staffing or workflow to ensure continued access to series 
for library users and staff. Other libraries can compare the 
situation at OSU Libraries to their own to evaluate how 
series title access may have been affected by the LC's deci- 
sion. They also may find the study's methods useful for 
conducting their own research on this issue. 

Literature Review 

The ALA executive board had expressed concern about the 
lack of time available for libraries to prepare for the change 
in series authority treatment. It stated that libraries needed 
that time to determine the effect of the decision and the 
options for providing continuing series authority control. 10 
To date, no formal studies have been published. The fol- 
lowing literature review covers opinions that have been 
expressed about the decision's potential consequences for 
libraries as well as a set of responses to an informal survey 
about changes to local practices following the series deci- 
sion. The survey was distributed through the PCC electronic 
discussion list, PCCLIST, which is accessible only to PCC 
members and focuses on cooperative cataloging issues. 

Reference librarians have commented on the adverse
effect of the decision on the provision of reference service
to patrons. Although stopping short of chastising the LC,
Mitchell and Watstein list the many ways in which series 
authority control affects users, including finding series, 
classifying works, distinguishing between series and sub- 
series, and supporting the incorporation of the principles of 
Functional Requirements for Bibliographic Records in cata- 
logs. 11 They point out that in FY2004, bibliographic records 
created by the LC included 82,447 series statements. 
Donlan notes that the PCC creates more series authority 
records than the LC, but also points out that the organiza- 
tion's members would have to nearly double output of such 
records to sustain the number created before the LC's policy 
change. 12 She suggests that libraries may need to hire cata- 
logers to counter the effects of the LC's series decision. 

McElfresh points out that users are hindered by key- 
word searching and Google-like search engines. 13 She cites 
Tillett regarding the importance of a controlled vocabulary 
for precision in searching and as being necessary for cre- 
ation of the Semantic Web. 14 McElfresh therefore argues 
that series authority control is important and, if it is to be 
continued, libraries must commit to taking on the task 
no longer being shouldered by the LC. If libraries share the
responsibility for series authority work, the effect of the LC's
decision can be lessened. However, she does not attempt to 
assess how much work this adds to cataloging departments 
or how they are to finance this added burden. 

In the fall of 2007, a year after the LC's announcement, 
a query sent to the PCC discussion list asked list members 
for input on how their libraries were handling the series 
issue. Staff at the University of Colorado at Boulder have 
continued their practice of providing series authority con- 
trol by creating new series authority records and editing 
bibliographic records with untraced series that should
have been traced. They have found a small increase in 
the number of new series authority records that they need 
to create, but nothing overwhelming. They found that of 
fifteen hundred new records, only twenty-four (less than 2 
percent) had untraced series headings. They, as well as some 
other responders to the survey, indicated that they are look- 
ing into outsourcing their authority control work. 

New York University copy catalogers search all series and 
flag anything new to their system for later review by serials- 
experienced copy catalogers. 1 ' Original catalogers continue 
to create new series authority records. Their impression is 
that there have been relatively few new series for which 
they would have expected the LC to create an authority 
record in the past. Duke University catalogers reported that 
much of their authority control processing is automated. 17
The effect of the LC's series decision was small: over the
course of a quarter, five series required manual corrections
that were attributed to the LC's suspension of
series authority record creation. Northwestern University
has a loader program that compares untraced series head- 
ings with authority records and automatically adjusts the 
series tracing as appropriate. 18 If it cannot perform this 
action, a cataloger is notified by e-mail. The program also 
now compares field 490 (with a first indicator of "0") with 
existing authority records. Review of e-mail messages takes 
only minutes per day. In the year since the LC's authority 
decision, the number of series authority records needing 
to be created has only increased slightly. The University of 
Georgia, University of Florida, and Indiana University did 
not comment on the effect of the decision in their libraries, 
but did say that they continue to trace series because they 
consider series to be an important access point for public 
and other functions of their libraries. 19 



Setting 

OSU is a land, sea, sun, and space grant institution with 
approximately nineteen thousand students and eighteen 
hundred faculty. The OSU Libraries' holdings include more 
than 1.4 million volumes, 14,000 serial subscriptions, and 
more than 500,000 maps and government documents. A
main library and veterinary medicine library on the main 
campus are complemented by two branch libraries serving 
remote facilities of the university. 

The Technical Services Department includes serials and 
monographs cataloging units and a digital production unit in 
addition to acquisitions units. The three units involved in cat- 
aloging include 2.5 full-time equivalent (FTE) catalog librar- 
ians and 8.5 FTE paraprofessionals. The cataloging units 
participate in the Program for Cooperative Cataloging Name 
Authority Cooperative Program (NACO), Subject Authority 
Cooperative Program (SACO), and Cooperative Online 
Serials Program (CONSER). OSU Libraries has not yet been 
declared independent for creation of series records. 

OSU Libraries acquires approximately fifteen thousand 
monographs annually in addition to receiving about five 
thousand government documents. Of the firm ordered and 
approval plan monographs, approximately 85 percent have 
cataloging copy (contributed either by the LC or a member 
library) available through the OCLC bibliographic database 
with full-level cataloging, including call numbers and sub- 
ject headings. Most cataloging copy is not scrutinized for 
authority control and is downloaded in a "fast-cat" process 
by a lower-level paraprofessional. The remaining 15 per- 
cent require subject analysis or original cataloging or both. 
Original cataloging is done by a monographs cataloger and 
upper-level paraprofessionals; series encountered during 
original cataloging are searched in the LCNAF and traced 
according to the series authority record or, if no authority 
record is found, according to Anglo-American Cataloguing 
Rules, 2nd ed. (AACR2) and Library of Congress Rule 
Interpretations (LCRI). 20 

Research Method 

Between June 1, 2006, and December 31, 2007, new biblio- 
graphic records downloaded into the OSU Libraries' catalog 
were reviewed periodically for the presence of untraced 
series statements (MARC field 490 with a first indicator 
of "0"). This was done using the "create lists" function in 
the OSU Libraries' Innovative Interfaces integrated library 
system. The list was then sorted alphabetically by series title 
for ease of review. The printed reports listed the source of 
the catalog record, title proper, series statement, and pub- 
lication date. 
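
The review list described above can be sketched in a few lines of Python. This is only an illustration, not the Innovative Interfaces "create lists" function itself: the `SeriesField` and `BibRecord` classes, their attribute names, and the `untraced_series_report` helper are hypothetical, and records are assumed to be already parsed into simple objects. The sketch selects field 490 with first indicator "0" (the MARC 21 value for a series that is not traced) and sorts the rows by series title.

```python
from dataclasses import dataclass, field


@dataclass
class SeriesField:
    """One series-related MARC field (hypothetical, simplified model)."""
    tag: str    # MARC tag, e.g. "490"
    ind1: str   # first indicator
    ind2: str   # second indicator
    title: str  # series statement as transcribed


@dataclass
class BibRecord:
    """A bibliographic record reduced to the columns of the printed report."""
    source: str        # source of the catalog record, e.g. "DLC" or "GPO"
    title_proper: str
    pub_year: int
    series: list = field(default_factory=list)  # list of SeriesField


def untraced_series_report(records):
    """Return (series title, source, title proper, year) rows for every
    490 field whose first indicator is "0", sorted by series title."""
    rows = [
        (f.title, r.source, r.title_proper, r.pub_year)
        for r in records
        for f in r.series
        if f.tag == "490" and f.ind1 == "0"
    ]
    # Case-insensitive alphabetical sort, for ease of review.
    return sorted(rows, key=lambda row: row[0].casefold())
```

A record with several untraced series statements contributes one row per statement, which matches the study's distinction between records with untraced series and untraced series statements.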

Each series title was then searched against the LCNAF, 
which is accessible through OCLC Connexion. If the form 
of the series title on the authority record matched the form 
in the series statement, the field on the bibliographic record 
was revised to a series statement/title-added entry (MARC 
field 440). Proposed changes to MARC 21 to accommo- 
date a new method of recording and tracing series titles 
were approved by the Machine-Readable Bibliographic 
Information Committee (MARBI) at the ALA Annual 
Conference in Anaheim in June 2008. This occurred during 
the drafting of this paper. Since this study was conducted 
before these changes were approved, and since the changes 
have not yet been put into effect, this paper reflects long- 
standing practices of series title transcription and tracing. 

If the form of the series title in the LCNAF did not 
match the form in the bibliographic record, the first 
indicator of field 490 was changed to "1" and a field 830 
(series added entry, uniform title) added to the record with 
the authorized form of the series title. 

If no LCNAF record was found for the series title, the 
library's catalog was checked to see if records with the series 
title already existed in the database. If they did, the cata- 
loger edited the record to make sure that the new record's 
series tracing matched the records in the catalog. If no other 
records were found in the catalog with the same series title, 
the series was traced as it appeared on the piece (i.e., chang- 
ing the field 490, first indicator "0," to field 440, second indi- 
cator "0") unless the title was generic. Generic titles, such as 
Annual report, were revised to follow AACR2 and LCRI in 
constructing a uniform title by changing the first indicator to 
"1" and adding a field 830 with the necessary qualifier. 

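The decision procedure in the preceding paragraphs can be summarized as a small function. This is a minimal sketch under stated assumptions, not the workflow actually used at OSU Libraries: the function name, its parameters, the `GENERIC_TITLES` set, and the simple string normalization are all hypothetical, the cataloger's judgment is reduced to an exact comparison of normalized titles, and nonfiling characters for initial articles are ignored. Authority records that call for an untraced series are not modeled.

```python
# Illustrative only: a real list of generic titles would come from cataloger judgment.
GENERIC_TITLES = {"annual report", "bulletin", "report"}


def normalize(title):
    """Fold case, collapse whitespace, and drop trailing punctuation."""
    return " ".join(title.casefold().split()).rstrip(" .;")


def decide_series_treatment(statement, lcnaf_form=None, local_form=None, qualifier=None):
    """Return the (tag, ind1, ind2, title) fields that replace a 490 0_ field.

    statement  -- series title as it appears in the untraced 490 field
    lcnaf_form -- authorized form from the LCNAF, or None if no record was found
    local_form -- form already traced in the local catalog, or None
    qualifier  -- qualifier for the uniform title of a generic series title
    """
    # Prefer the LCNAF form; fall back to the form already used locally.
    authorized = lcnaf_form if lcnaf_form is not None else local_form
    if authorized is not None:
        if normalize(authorized) == normalize(statement):
            return [("440", " ", "0", statement)]
        return [("490", "1", " ", statement), ("830", " ", "0", authorized)]
    if normalize(statement) in GENERIC_TITLES:
        if not qualifier:
            raise ValueError("a generic series title needs a qualifier")
        uniform_title = f"{statement} ({qualifier})"
        return [("490", "1", " ", statement), ("830", " ", "0", uniform_title)]
    # No authority record, no local precedent, not generic: trace as on the piece.
    return [("440", " ", "0", statement)]
```

For example, calling `decide_series_treatment("Annual report", qualifier="Example Agency")` with the made-up qualifier "Example Agency" yields a traced 490 field plus an 830 field carrying the qualified uniform title.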
A tally was made of the treatment each series title 
received. Data were then entered into a spreadsheet. For 
each of the eight times untraced series were reviewed dur- 
ing the study, the following data were determined: 

• number of new records entering the system 

• number of new records with untraced series 

• number of untraced series statements 

• number of untraced series statements by source of 
the bibliographic records (characterized as LC cata- 
loging after June 2006, Government Printing Office 
(GPO) cataloging, or other) 

• number with publication dates before 1999 

• number of series titles represented by an author- 
ity record in the LCNAF and how the bibliographic 
record was modified (changed to field 440 or fields 
490/830 or left untraced) 

• number of series titles not represented by an author- 
ity record in the LCNAF and how the bibliographic 
record was modified (changed to field 440 or fields 
490/830 or left untraced) 
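
The tally described in the list above could be kept with a simple counter instead of a spreadsheet. The sketch below is an assumption about how such a tally might be structured; the `tally_treatments` function and the category labels are hypothetical and only mirror the last two bullet points (presence in the LCNAF, and how the record was modified).

```python
from collections import Counter

# The three treatments named in the study.
ALLOWED_TREATMENTS = {"440", "490/830", "untraced"}


def tally_treatments(reviewed):
    """Count series statements by LCNAF presence and treatment.

    reviewed -- iterable of (in_lcnaf, treatment) pairs, where in_lcnaf is a
                bool and treatment is one of ALLOWED_TREATMENTS
    """
    counts = Counter()
    for in_lcnaf, treatment in reviewed:
        if treatment not in ALLOWED_TREATMENTS:
            raise ValueError(f"unknown treatment: {treatment!r}")
        group = "LCNAF" if in_lcnaf else "not in LCNAF"
        counts[(group, treatment)] += 1
    return counts
```

Because a `Counter` returns zero for missing keys, categories that never occur in a review period can be reported without special handling.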

Analysis of data indicated that a significant proportion of 
untraced series were present in older bibliographic records 
that entered the system because of retrospective conversion 
projects. To determine how much of the work of reviewing 
untraced series was because of these projects, works with a 
publication date prior to 1999 were examined as a separate 
subset of bibliographic records. Since most new purchases 
had publication dates in the past three years and most 
retrospective work involved works with publication dates of 
thirty or more years ago, 1999 seemed like a logical cutoff 
date to divide recent from older publications. 

Findings 

Of the 53,911 records added to OSU Libraries' catalog dur- 
ing the eighteen months of the study, 977 (2 percent) had an 
untraced series statement (field 490, first indicator "0"); see 
table 1. Of these, only 60 (6 percent) were records created 
by the LC after the 2006 decision was made (see table 2). 
The majority of records (64 percent) with untraced series 
statements came from the GPO. Many untraced series 
were found in older records (i.e., publication dates of 1999 
and earlier), the result of an ongoing OSU Libraries serials 
retrospective conversion project as well as the cataloging of 
older materials in the Atomic Energy Collection in Special 
Collections. These older materials totaled 266 records (27 
percent), although some may have overlapped with the 
GPO records just mentioned. 

Of the 977 records with an untraced series statement, 
545 (56 percent) were represented by an authority record 
in the LCNAF. For 96 (10 percent), the authority record 
was for an untraced series or a quoted note. This is the only 
group of series statements with authority records that was 
already coded correctly. Another 40 (4 percent) also 
should have been recorded as untraced series, although no 
authority record for the statement was found. This brings 
the total for untraced series titles recorded correctly to 136 
(14 percent); see table 3. 
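
The percentages in the two paragraphs above can be rechecked from the reported counts. The short script below is only an arithmetic check; the counts come from the text, while the `pct` helper and the dictionary labels are introduced here for illustration.

```python
def pct(part, whole):
    """Percentage rounded to the nearest whole number."""
    return round(100 * part / whole)


TOTAL_RECORDS = 53_911  # records added during the study
UNTRACED = 977          # records with an untraced series statement

counts = {
    "LC records created after the 2006 decision": 60,
    "GPO records": 625,
    "older materials": 266,
    "represented in the LCNAF": 545,
    "authority record for untraced series or quoted note": 96,
    "should be untraced, no authority record": 40,
}

# Share of all new records that carried an untraced series statement.
untraced_share = pct(UNTRACED, TOTAL_RECORDS)

# Each subgroup as a share of the 977 records with untraced series.
shares = {label: pct(n, UNTRACED) for label, n in counts.items()}

# Series statements that were already recorded correctly as untraced.
correct_untraced = 96 + 40
correct_untraced_share = pct(correct_untraced, UNTRACED)
```

Run as written, the rounded values agree with the percentages given in the text (2, 6, 64, 27, 56, 10, 4, and 14 percent).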



Table 1. Untraced Series Statements in New Bibliographic Records, June 1, 2006-December 31, 2007 

                         Number of Records    Percent 
Without field 490 00                52,934         98 
With field 490 00                      977          2 
Total                               53,911        100 




Table 2. Sources of Bibliographic Records with Untraced Series Statements 

Source of Cataloging     Number of Records    Percent 
DLC, post-2006                          60          6 
GPO                                    625         64 
Other                                  292         30 
Total                                  977        100 

For another 334 records (34 percent), the series was 
recorded in the same form as either the LCNAF or as it 
would have been if current AACR2 rules had been consulted. 
In other words, only the coding of these series statements 
was incorrect under current cataloging rules. These were 
the easiest to convert to traced series: because the content 
of the fields was already correct, only the coding had to 
change. 

The remaining 494 records (51 percent) had the series 
recorded differently than appeared in the LCNAF or the 
way the series would be traced according to AACR2 rules. 
Changing these to their correct form and coding required 
the addition of a field 830. 

Over the course of the eighteen months of this study, 
approximately 54 records with untraced series statements 
were added to the catalog monthly. The time required to 
research these series titles in the LCNAF and make changes 
to the catalog was about two hours each month. 

Discussion 

The data answer the question as to the extent of work 
needed to ensure continuing authority control of series titles 
in a catalog at an institution such as OSU Libraries. Of the 
untraced series titles in the study, 57 percent had existing 
authority records in the LCNAF. For this proportion of 
records, an automated authority control service would have 
been able to do the authority work and thereby reduce the 
workload of correcting series titles. Only 25 percent of all 
untraced series titles differed from their authorized forms. 



Table 3. Presence of Series Statements in LCNAF 

                         Number of Records    Percent 
Series in LCNAF 
  Traced the same                      212         22 
  Traced differently                   237         25 
  Not traced                            96         10 
  Subtotal                             545         57 
Series not in LCNAF 
  Traced the same                      122         13 
  Traced differently                   257         27 
  Not traced                            40          4 
  Subtotal                             419         44 
Total                                  964        101* 

*Note: percentages do not equal 100 because of rounding. 



If an integrated library system indexed MARC field 490, 
these would be the only ones that require correction to pro- 
vide access to the authorized series titles. 

An additional 27 percent would be traced differently 
if an authority record were created according to AACR2. 
For these series titles, a cataloger must either search for the 
correct authority record or construct a uniform title head- 
ing and record this data in MARC fields 490 and 830. In an 
integrated library system that indexes field 490, this 52 per- 
cent (the 25 percent of series titles in the LCNAF plus the 
27 percent not in the LCNAF) represents the work needed 
to make these series titles accessible in an online catalog. In 
a system that does not index the 490 field, an additional 35 
percent will require authority work and the retagging of the 
490 field, for a total of 87 percent of bibliographic records 
with untraced series. 
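
The 52 percent and 87 percent figures can be reproduced from the counts in table 3. The sketch below assumes, as the table's column totals suggest, that each row percentage is calculated against the table total of 964 and rounded before being added; that reading is an inference from the published numbers, and the variable names are hypothetical.

```python
# Counts from table 3, keyed by (presence in LCNAF, treatment).
TABLE3 = {
    ("in LCNAF", "traced the same"): 212,
    ("in LCNAF", "traced differently"): 237,
    ("in LCNAF", "not traced"): 96,
    ("not in LCNAF", "traced the same"): 122,
    ("not in LCNAF", "traced differently"): 257,
    ("not in LCNAF", "not traced"): 40,
}

total = sum(TABLE3.values())

# Row percentages, each rounded on its own (assumed method).
row_pct = {key: round(100 * n / total) for key, n in TABLE3.items()}

# Work needed even when the system indexes field 490: add a field 830.
needs_830 = (row_pct[("in LCNAF", "traced differently")]
             + row_pct[("not in LCNAF", "traced differently")])

# Extra work when field 490 is not indexed: retag correctly worded statements.
retag_only = (row_pct[("in LCNAF", "traced the same")]
              + row_pct[("not in LCNAF", "traced the same")])

total_needing_work = needs_830 + retag_only
```

Summing percentages that have already been rounded is also why the rows of table 3 add up to 101 rather than 100.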

Even with this high percentage needing revision, 
relatively few records with untraced series titles are being 
added to the catalog. In about two hours each month, staff 
can identify these records, search for and download series 
authority records from the LCNAF, and then make any 
necessary changes to the bibliographic records. These pro- 
cedures are easily absorbed by staff in the current workflow. 
In a library using automated authority control services, the 
time needed to complete all corrections would be consider- 
ably less. 

This study identified three main sources of untraced 
series statements in bibliographic records: the LC, the 
GPO, and retrospective conversion projects that loaded 
older bibliographic records into the OSU Libraries catalog. 

The number of untraced series statements in LC records 
is very small, with approximately three or four records with 
untraced series added each month. Many LC records may 
initially have had untraced series statements that were 
revised by other libraries before being downloaded into the 
OSU Libraries catalog. The statement from some catalog- 
ers that having many libraries shoulder the additional series 
authority work caused by the LC's decision would soften the 
decision's effect is probably correct, but this would need to 
be verified in a different study. 

If the rate at which new series are created by major 
publishers remains the same as the rate during this study, 
catalogers should be able to keep up with series authority 
record creation and bibliographic maintenance. If, however, 
series title changes increase or the number of new series 
increases, series authority record creation and bibliographic 
maintenance would need to be increased as well. Expansion 
of series authority training programs would help mitigate 
this trend should it appear. 

As the LC revises its priorities and focuses its resources 
on serving Congress rather than acting as the de facto 
national library, the degree to which it perceives its suc- 
cess at abandoning series authority work may lead to its 
abdication in other areas of library leadership. 

The GPO stated in June 2006 that it would not follow 
the LC's lead and would continue to create series authority 
records. However, the rate at which it does so must 
be slow, since so many of the series on GPO bibliographic 
records lack authority records. Also, many of the series in 
GPO records were coded as untraced when their authority 
records indicated that they should be traced. GPO does not 
appear to be following its own policy and is instead choosing 
to slow down its creation of series authority records as well as 
not tracing series that have existing series authority records. 
These practices have created the majority of records need- 
ing scrutiny and change in this study. Since OSU Libraries 
purchases most of its GPO records from a third-party ven- 
dor, it is looking into having the vendor authorize the series 
headings before sending them. 

Interestingly, a significant proportion of the series 
authority control work currently done at OSU Libraries 
appears to stem from projects involving older materials. 
Some records for these older materials reflect earlier cata- 
loging practices; others are minimal-level records probably 
created during other libraries' retrospective conversion 
projects. No series authority records are present in LCNAF 
for many of these older materials' series statements. Once 
the retrospective conversion of serials and special collec- 
tions is completed, the time needed to process untraced 
series should decrease noticeably. 

The burden of additional work predicted by some at 
the time of the LC's announcement has not materialized 
for OSU Libraries. On the other hand, other libraries' 
experiences could differ from OSU Libraries' depending 
on the type of library, types of materials collected, and the 
degree to which those materials are published in series. 
For example, a research institution collecting more gray lit- 
erature than OSU Libraries might see more series titles not 
represented in the LCNAF. Similarly, public libraries, which 
often collect children's books published in series, might find 
that authority records for these series titles are unavailable. 

Conclusion 

This study enabled OSU Libraries catalogers to learn about 
untraced series titles in the bibliographic records that are 
added to the catalog. The LC's decision to stop creating 
series authority records and to treat all series as untraced 
has had only a minor effect on cataloging workflow. Instead, 
a significant proportion of series authority work was gener- 
ated by the cataloging of federal documents, retrospective 
conversion projects, and the addition of older materials to 
OSU Libraries' collections. Government document records 
exhibit poor series authority control and require a significant 
amount of attention. Still, such authority maintenance does 
not appear to be a significant burden at this time. Series 
will continue to be tracked as records are added to the OSU 
Libraries' catalog. With the baseline provided by the current 
study, changes in series authority control workload should 
be detectable in the future. 

Research into the effect of the LC's decision should be 
conducted at other libraries, including larger institutions 
and other types of libraries, to determine if the findings 
from the OSU Libraries study can be replicated. Repeating 
this research in the future could also shed light on whether 
series authority record creation can keep pace with the 
appearance of new series. Such studies would provide a 
larger context in which to assess the effect of the LC's series 
authority decision on the cataloging community. 
Notes on Operations 

Acquisitions Globalized: The 
Foreign Language Acquisitions 
Experience in a Research Library 



By Judit H. Ward 



This paper highlights foreign language titles from the perspective of acquisitions 
in a large academic research library. Selecting, ordering, cataloging, and provid- 
ing access to non-English materials reach beyond the boundaries of departments 
responsible for the individual tasks. Assignments require different levels of lan- 
guage proficiency ranging from bibliographic proficiency to the near-native pro- 
ficiency of the educated speaker. The highest level of language proficiency is used 
at the earliest and latest point of technical services (i.e., ordering and cataloging), 
and the rest requires only bibliographic proficiency or none at all. Because inter- 
national vendor experiences vary country by country, strong cooperation is criti- 
cal between the partners in the acquisition process. Vendor-supplied records used 
for foreign language acquisition purposes seem to have the potential to improve 
accuracy in bibliographic records. 

Foreign language titles have always attracted special attention in North American 
research libraries because of the uncommon nature of selection and acquisition 
methods. In an academic and research setting, the dissemination and enhanced 
access to scholarly information in foreign languages have never been more 
important than in the current, increasingly globalized information market and in 
ubiquitous international research cooperation. Changes in research patterns have 
created a greater diversity of needs and increased demand for foreign titles, and 
emerging and nontraditional research areas require a broader array of materials. 

Although an accurate and up-to-date assessment of North American collec- 
tions of foreign language materials seems to be difficult to complete at any given 
time, research has revealed declining foreign language acquisitions in North 
American academic libraries starting in the 1980s. 1 The 1985 Research Library 
Group Study singled out language as the most statistically significant factor in 
the failure to acquire books. 2 Other reasons include financial constraints (i.e., the 
deflation of the dollar against the world currency market and ever rising costs of 
materials) and political developments that influence collection strategies. 3 Since 
the 1990s, foreign language collections in the United States have increasingly 
been influenced by three main factors: political changes, such as the extraordinary 
transformations in Eastern Europe, the former Soviet Union, and Latin America; 
the effect of the global marketplace's growing need for personnel well trained in 
foreign languages and cultures; and the subsequently expanding boundaries of 
research, including a remarkable increase in collaboration between researchers 
in science and technology. 4 

Historically, several projects and programs were established after the foreign 
materials acquisitions crisis was discovered and documented. Organized and 
funded by a variety of sources, they aimed at sharing resources and cooperating to 
facilitate the development of coordinated and distributed collections in research 
libraries, including special regional and area-based cooperative programs such as 
the Association of Research Libraries Foreign Acquisitions Project, Seminar for the 
Acquisition of Latin American Library 
Materials, Latin American Research 
Resources Project, and the Library of 
Congress (LC) National Program of 
Acquisitions and Cataloging. 5 Many of 
these initiatives approach cooperative 
collection development from a fresh, 
new perspective and with an elec- 
tronic availability in mind. What all 
these projects have in common is the 
need for appropriate human resources 
in the member libraries, including 
librarians and paraprofessionals with 
foreign language skills, as well as a 
thorough workflow analysis to imple- 
ment the most effective use of these 
resources. 

Selection and Acquisition: 
Differences between 
International and U.S. 
Imprints 

In addition to the language variations, 
foreign language acquisitions differ 
in many ways from domestic acquisi- 
tions, which may involve the auto- 
mated process of the post-selection 
purchasing of current U.S. imprints 
with Machine-Readable Cataloging 
(MARC) records loaded into the inte- 
grated library system (ILS). Among 
the differences are issues with par- 
ticular languages and language skills of 
staff, availability and quality of books 
and bibliographic records, variety in 
the levels of automation, and diverse 
library vendor relations. Acquisitions 
of foreign language titles have always 
been a problem, regardless of the 
subject. The complexities of selecting, 
acquiring, and cataloging foreign lan- 
guage publications can be demonstrat- 
ed by the developments in the area of 
Slavic and East European titles, where 
the methods have not changed signifi- 
cantly from the description published 
by Burger in 1990. 6 Selection and 
acquisition outside of approval plans 
remained an important way to develop 
library collections. 7 

In the case of North American 
publications, much of the selector's 
work of selecting, rejecting, referring, 
or saving titles for later selection in 
large research libraries happens 
in the vendor's online database. 
The database also provides the selec- 
tor with the history of library activi- 
ties, prevents duplication, and features 
direct connection with the module 
or interface used by the acquisitions 
staff. Selectors and acquisitions staff 
may work in the database through 
the same interface. Once the selec- 
tor's work has been finished, acquisi- 
tions staff members locate and export 
selected items in the online system 
and complete orders electronically. As 
a result, bibliographic records may be 
loaded overnight into the ILS in an 
automated process, generating pur- 
chase orders title by title. With a less 
automated method, the acquisition 
staff member locates the best source 
to purchase titles and completes the 
order electronically, which includes 
exporting or creating records. Batch 
processing (searching, ordering, and 
loading items into the ILS in batches) 
significantly reduces turnaround time, 
narrows the margin of error, and helps 
prevent accidental 
duplication. Keeping track of orders, 
claiming, invoicing, and overriding 
duplication rules are all provided by 
the vendor's database. Books, shelf- 
ready if requested (i.e., bar-coded, 
labeled, and tattle-taped), are shipped 
within a few weeks in sturdy boxes, 
each containing a packing slip. Most 
North American vendors work with 
dependable domestic carriers and usu- 
ally also use quality shipping and fill- 
ing material, and books also are often 
shrink-wrapped individually. The 
books are received and cataloged by 
either copy or original catalogers, all 
trained to perform the given activity in 
English. Expectations toward interna- 
tional book vendors are defined by this 
type of acquisition experience. 

Similar to the process outlined 
above, selection of foreign language 
titles is conducted by librarians, 
usually subject specialists in a particular 
field with appropriate language 
skills. These skills are often gained 
through studies for another advanced 
degree or the result of being raised 
in a bilingual or multilingual environ- 
ment. Selection methods, however, 
may largely rely on more conventional 
sources, such as paper slips, publishers' 
and book jobbers' print catalogs, pre- 
publication announcements, special 
offers, or informal channels. Only the 
major European vendors, such as Aux 
Amateurs de Livres International and 
Jean Touzot Librairie Internationale 
in France, Casalini Libri in Italy, Otto 
Harrassowitz in Germany, and Puvill 
Libros in Spain, offer online catalogs 
with some or all of the options of 
electronic ordering, invoicing, and 
providing MARC records. These sites 
are also available in English, and fea- 
ture similar functions to major U.S. 
vendors' password-protected sites and 
databases, such as order status and 
statements of account balances. In 
the case of other languages, the situ- 
ation is far from ideal from the North 
American buyer's perspective. 

The process of selecting and 
acquiring titles in foreign languages 
also may differ significantly by lan- 
guage, country, culture, and vendor. 
Services are heavily influenced by the 
vendor's incentive to expand in the 
North American market and to keep 
up with the advancements in technical 
services in the North American librar- 
ies. The quality of the books varies sig- 
nificantly from country to country, and 
print runs are relatively short. 8 Prices 
may also fluctuate and may incorpo- 
rate hefty bank transfer fees for small 
purchases, which makes planning and 
budgeting in a library extremely diffi- 
cult. Shipping might be unpredictable, 
delayed, or at least nontraceable, and 
duplicates are costly to return. The 
quality of packaging may range from 
the highest quality packing materials 
to soft boxes of dubious origin with 
improvised filling material. 

Ordering foreign language titles 
requires advanced or at least intermediate 
language proficiency. After 
finding the sources of acquisitions, the 
staff member has to type or load a bib- 
liographic record into the ILS before 
attaching a purchase order. Acquisition 
records range from brief records (i.e., 
author, title, and minimal publication 
data) to full catalog records exported 
from a bibliographic utility. Language 
proficiency at this point is crucial for 
the subsequent steps in processing 
the book in hand. Locating the proper 
record in a bibliographic utility as 
well as loading or typing appropriate 
records into the ILS at the point of 
acquisition can be a time-consuming 
process. However, the time invested 
in the first phase is compensated by 
the smaller margin of error and instant 
availability. It also makes receiving and 
copy cataloging books feasible by a 
staff member with a lower level of for- 
eign language proficiency in the case 
of non-Roman alphabets and nearly 
no foreign language proficiency in the 
case of Roman scripts. Coordinating 
and organizing the various steps of 
selection and acquisition will largely 
depend on available language skills 
and subject expertise in a particular 
library setting. 

Cooperation as Solution: 
Vendor-Supplied 
Bibliographic Records 

Cooperative efforts have assisted librar- 
ies in many areas. Beginning in 1996, 
OCLC and the Research Libraries 
Group started to load minimal-level 
catalog records from several European 
book vendors into their respective data- 
bases. Most of the European vendor 
records in OCLC's WorldCat are from 
France, Germany, Italy, and Spain. 
Since five of the ten largest language 
groups of the world can be found in 
Europe and are responsible for 44 
percent of the world book production, 
academic libraries collect heavily from 
these countries. 9 The respective booksellers 
in each country took the initiative 
and created records for the majority of 
the titles published in their country 
or in the language of their country. 
Two booksellers from Spain contribute 
records to WorldCat: Puvill Libros 
and Iberbook. Otto Harrassowitz con- 
tributes records for German imprints, 
Casalini Libri provides records for 
books from Italy, and Jean Touzot 
Librairie Internationale and Aux 
Amateurs de Livres supply records 
for French imprints. By 2000, vendor 
records accounted for 16.7 percent of 
Spanish books, 18 percent of French 
books, 33.6 percent of German books, 
and 52.5 percent of Italian books. 10 
The number of libraries enhancing 
vendor records in WorldCat was found 
only to be approximately one-third 
of the number of libraries contribut- 
ing original records for European lan- 
guage books. For example, 60 percent 
of records for books in Italian were 
contributed to WorldCat by a vendor, 
whereas 30 percent were contributed 
by the LC and only 10 percent by 
member libraries. 11 

Authors have raised questions 
about the value of minimal-level ven- 
dor catalog records for European lan- 
guage monographs and their effect 
on catalog department workflows 
and national cooperative cataloging 
efforts. 12 The main feature that dis- 
tinguishes the vendor records from 
other minimal-level bibliographic 
records is that the author, series, and 
subject headings on the nonvendor 
records have a much higher likelihood 
of matching the authorized form for 
these headings. In contrast, the head- 
ings on vendor records, if they are 
present at all, tend not to match the 
authoritative form. 13 

Notable initiatives were launched 
to mitigate these issues in the past 
few years. Catalogers at Casalini Libri 
have recently been trained by an 
LC representative in the LC's clas- 
sification and subject heading sys- 
tems. 14 OCLC's Enhance program 
(www.nelinet.net/oclc/cataloging/ 
enhance.htm) was established to 
encourage participant libraries to 
upgrade vendor records contributed in 
non-English languages. This program 
is designed to allow skilled catalogers 
to improve the quality of the OCLC 
WorldCat database by upgrading 
WorldCat records, primarily from less- 
than-full level to full level. The LC 
developed the MARC Record Guide 
for Monograph Aggregator Vendors, 
which was prepared by the Program 
for Cooperative Cataloging in 2006. 15 

In addition to the efforts for an 
international cooperation to rou- 
tinely provide bibliographic records 
to WorldCat, international vendor- 
supplied records serve as an example 
of sensible outsourcing. An article 
published in 2007 with an extensive 
review of recent outsourcing experi- 
ence — albeit not in an international 
context — strongly advocates an out- 
sourcing program of this kind. 16 The 
thorough evaluation of cataloging and 
physical processing supplied through 
the University of Arkansas Libraries' 
shelf-ready contract with YBP Library 
Services and PromptCat in 2005-6 
points out that the workflow is consid- 
erably shortened in most cases, and the 
shelf-ready procedures are briefer than 
those used for normal copy cataloging. 

One of the most recent initia- 
tives, WorldCat Selection Services, 
is an example of merging the best 
of collaboration and outsourcing for 
the benefit of academic and research 
libraries. Based on the software called 
the Integrated Tool for Selection 
and Ordering developed at Cornell 
University Library, it features an online 
tool to streamline and automate selec- 
tion and acquisition with the ability to 
handle foreign language titles. This 
tool integrates records from multiple 
vendors in a single system without 
differentiating domestic and foreign 
orders and records. Batch processing 
also allows loading records into the 
ILS early in the technical services process. 17 
In another collaborative initiative, 
the LC, Bibliothèque nationale de 
France, Deutsche Nationalbibliothek, 
and OCLC have signed a memoran- 
dum of understanding to extend and 
enhance the Virtual International 
Authority File, a project that virtu- 
ally combines multiple name author- 
ity files into a single name authority 
service. 18 

Foreign vendor records are valu- 
able from a broader perspective. Not 
only are they often more accurate than 
the ad hoc brief record keyed in from 
an unverified source by a staff mem- 
ber lacking the sufficient language 
skills, but records in WorldCat often 
help verify titles and availability. For 
the acquisitions workflow, minimal- 
level vendor records can mean greater 
accuracy and faster turnaround. A 
significantly lower level of language 
proficiency is necessary to identify 
and process these titles. For the local 
library user, these records provide 
access to the information at the earli- 
est point of acquisition. 

A Case Study: Acquisitions 
of Foreign Language Titles at 
Rutgers University Libraries 

The curriculum at Rutgers, the State 
University of New Jersey, relies heav- 
ily on foreign languages both at the 
undergraduate and graduate levels on 
all three campuses. In addition to 
the Master of Arts, Master of Arts in 
Teaching, and PhD programs in sever- 
al languages offered by the respective 
departments, Rutgers also operates the 
World Languages Institute, a major 
outreach graduate-level program for 
K-12 world language teachers, includ- 
ing the K-12 Chinese Teacher Training 
Initiative. As a state university, it is also 
a repository for domestic and foreign 
government documents. 

During the past five years, Rutgers 
University Libraries (RUL) used various 
processes to acquire foreign language 
titles. In 2002, the forward-looking 
management at the Technical 
and Automated Services created a 
new staff position to focus on for- 
eign language acquisitions. This posi- 
tion required language proficiency in 
at least three European languages, 
including Russian. The position was
filled from 2003 to 2007 by a former
linguist and language instructor
with advanced-level proficiency in the 
major European languages, including 
Russian, and bibliographic proficiency 
in many more. 

Types of acquisition resources 
vary from language to language and 
from country to country. The sce- 
nario closest to the established RUL 
practice with current North American 
imprints is the selection and acquisi- 
tion of titles in European languag- 
es, while RUL had many difficulties 
with Chinese, Japanese, and Korean 
(CJK) orders for a long time. The 
CJK workflow changed several times 
during these five years, corresponding 
to the changes in human resources. 
RUL made significant efforts to blend 
CJK workflow into the library routine, 
including selection, ordering, shipping, 
and tracing books as well as invoice 
processing. The recently established 
South Asian Studies Program means 
new challenges. 

Titles in European languages are 
typically selected by several subject 
specialists, such as the world history 
librarian, music librarian, women's 
studies librarian, and art and humani- 
ties librarians, then ordered by the 
same acquisitions staff member. 
German, French, and Italian titles 
have proven to be the easiest to select 
and order because of the online sys- 
tems of the respective vendors. These 
vendors closely follow the trends of 
the North American market, are pres- 
ent at major North American confer- 
ences, and tend to meet the general 
expectations in the United States in 
terms of electronic notification, dupli- 
cation, online catalogs, and ordering, 
as well as invoicing, claiming, ship- 
ping, returns, and rush orders. 

Spanish and Portuguese Language 
Acquisitions 

A large number of the foreign lan- 
guage titles at RUL are monographs 
in Spanish and Portuguese because 
of the needs of one of the largest
language departments at the university.
The Department of Spanish and
Portuguese offers both undergradu- 
ate and graduate courses in Hispanic 
literature, culture, and civilization as 
well as courses in Spanish linguistics, 
literary theory, Luso-Brazilian litera- 
ture and culture, translation, inter- 
pretation, and teaching and research 
methodology. The Spanish-English 
bilingual selector had an incredibly 
complex and demanding job to meet 
user needs and ordering deadlines 
imposed by the fiscal year. Titles were 
selected from the broadest sources 
ranging from formal to informal, such 
as booksellers' print and online noti- 
fications, vendors' websites, titles or 
booklists sent to her by antiquarians or 
friends, prepublication notes and spe- 
cial offers, and personal e-mail mes- 
sages and communications, to mention 
a few. Consequently, the acquisitions 
department's task was relatively easy 
for the majority of the titles in terms 
of verifying the availability with ven- 
dors or conducting a preorder search 
in the local catalog. Unconventional 
selection methods, however, tested the 
resourcefulness of the acquisition staff 
occasionally. 

The first serious challenge in for- 
eign language acquisitions is to locate 
quality bibliographic records to load 
into the ILS. Using OCLC Connexion 
(previously OCLC CatME) software 
does not always bring the desired 
results. Because many Latin American 
and European titles might be reprints 
with or without the same ISBN, individual
title search has proven successful
in most cases. A title-by-title search
instead of a batch ISBN search sig- 
nificantly slowed down the workflow. 
These records, many of them provided 
by vendors, could be uploaded to the 
ILS as a batch simultaneously with the 
English titles. 
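
The search sequence described above, a batch ISBN lookup that falls back to a title search when a reprint carries a different ISBN, can be sketched in a few lines of Python. This is only an illustration and not RUL's actual procedure or any ILS or OCLC interface: the record layout (a dict with `title` and `isbns` keys), the function names, and the sample ISBNs are all hypothetical.

```python
import unicodedata


def normalize_title(title: str) -> str:
    """Lowercase a title and drop diacritics and punctuation for loose matching."""
    decomposed = unicodedata.normalize("NFKD", title)
    no_marks = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    kept = "".join(ch if ch.isalnum() or ch.isspace() else " " for ch in no_marks)
    return " ".join(kept.lower().split())


def find_record(records, isbn=None, title=None):
    """Return candidate records: try the ISBN first, then fall back to the title."""
    if isbn:
        wanted = isbn.replace("-", "").strip()
        hits = [r for r in records if wanted in r.get("isbns", [])]
        if hits:
            return hits
    if title:
        key = normalize_title(title)
        return [r for r in records if normalize_title(r["title"]) == key]
    return []


# Hypothetical sample data; the ISBN below is a made-up placeholder.
catalog = [
    {"title": "Cien años de soledad", "isbns": ["9780000000002"]},
]

# A reprint with an unknown ISBN is still found through its title,
# even when the title is keyed without the diacritic.
matches = find_record(catalog, isbn="9780000000099", title="Cien anos de soledad")
```

Matching on a normalized title rather than on the keyed string also softens the effect of typing errors in diacritics, which the next paragraph identifies as a source of delay.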

The next challenge for the order- 
ing staff member is to find the recently 
loaded record in the local system so 
the purchase order could be attached 
to it (i.e., one order to one title as 
stipulated by the strict auditing rules 
as well as for convenient follow-up
with receiving or claiming). Again, the 
failure with the simple ISBN searches 
in the local system does not mean the 
title is not there. In RUL's experience, 
typing in a foreign language inadver- 
tently slows down the process and 
also increases the potential for typos. 
The lack of the appropriate language 
proficiency seems to have coincided 
with a larger number of errors when 
creating brief records for these titles 
instead of working with minimal-level 
vendor records, no matter how embry- 
onic they are. 

Ordering in European Languages: 
French, Italian, and German 

Acquisition of other titles in foreign lan- 
guages with Roman scripts effortlessly 
followed the same workflow after the 
pilot of about three thousand orders 
of Spanish and Portuguese language 
titles in 2003. Because of budget allo- 
cations, monographs written in lesser- 
taught languages such as Romanian, 
Hungarian, and Polish were ordered 
only occasionally and required indi- 
vidual attention. However, frequent- 
ly requested French, German, and 
Italian titles were easy to order after 
the selection, which mostly happened 
from vendor-supplied data, either print 
or electronic. French titles at RUL are 
selected by an art librarian (including 
topics in French literature, culture, 
and language) and by the women's 
studies or music librarian for their 
respective topics from slips by Aux 
Amateurs de Livres and Jean Touzot 
Librairie Internationale. Both vendors 
provide records to OCLC, the major 
source of record exports for RUL.
The French selector also started to use 
electronic slips recently, and services 
that these two vendors offer online are 
yet to be explored. 

Italian titles are selected by 
another art librarian and purchased 
from Casalini Libri, the vendor for 
titles published in Italy or written in 
Italian. Casalini features integrated 
library services with its monograph
and serial databases (i libri and le
riviste) to support selection and 
acquisitions. The database offers a 
search platform with advanced search 
options, such as bibliographic fields, 
subject fields by call number types, 
and others complemented by further 
optional fields such as interdisciplinary 
topics, level of audience, book types, 
and special formats. It is also possible 
to narrow the search to titles recently 
profiled by Casalini Libri or by price 
as well as titles on order. Selectors 
can set up their own profiles, while 
acquisitions staff have access to order- 
related information and can order 
online. Similar to major North 
American vendors, Casalini offers 
options to create a selection list and 
save titles including local order data, 
receive title information and selection 
lists automatically via e-mail, download 
bibliographical records in MARC for- 
mat, and place orders online. Casalini 
recently expanded the coverage to 
supply publications in other European 
languages. Casalini Libri is an excel- 
lent example of a vendor that offers 
a large variety of services and allows 
libraries to take advantage of them to 
the degree that best suits their needs. 
The RUL acquisitions staff ordered 
from paper notification slips and used 
the site only occasionally to verify 
availability and monitor order status. 
These slips contain the necessary 
bibliographic information; therefore 
ordering does not require more than 
a bibliographic proficiency of Italian. 
The same level is sufficient for receiv- 
ing and copy cataloging. After the new 
foreign language acquisition workflow 
was implemented for Italian titles, 
Rutgers had no problems or returns. 

German titles are selected by 
the world history librarian, who has 
advanced-level language proficiency. 
The vendor, Otto Harrassowitz, has an 
online database called OttoEditions, 
which has been widely used for both 
selection and acquisition. Technically, 
online ordering is available for the titles 
selected by the librarian. However, 
for workflow and human resources
reasons, it is easier for RUL to use
this service for rush orders only and to 
place regular orders according to the 
workflow streamlined with Spanish 
and Portuguese titles. Nevertheless, 
high efficiency applies to the acquisition
of German titles selected on the
vendor's site or from paper slips and
ordered both electronically and in tra- 
ditional ways. Harrassowitz targets the 
academic and research library com- 
munity by specializing in the distribu- 
tion of scholarly books, periodicals, 
e-resources, and music scores. This 
vendor consolidates all German and 
some European bookselling in period- 
ical subscriptions, databases, standing 
orders, and music scores. OttoSerials, 
Harrassowitz's online management 
system for periodicals and standing 
orders, simplifies the handling of print 
and electronic subscriptions for librar- 
ies. They can also provide MARC 
records, while (as a new service to 
facilitate pre-order searching) any 
OttoEditions account can be set up 
to automatically search the library's 
online catalog from OttoEditions with 
certain fields appearing as links in 
the OttoEditions bibliographic display. 
Clicking on any of these links initiates 
a search in the library's online catalog, 
allowing one to determine quickly if a 
library already holds that item. 

Acquisitions in Non-Roman Scripts 

With four members proficient in 
Russian in RUL Technical and 
Automated Services, the limited num- 
ber of Cyrillic orders has never been a 
problem. Neither have titles in Hebrew 
or Arabic, with a native speaker at 
the cataloging department. However, 
a part-time East Asian librarian and 
the foreign language acquisitions staff 
member were experimenting with an 
array of workflows for CJK orders dur- 
ing a four-year period. Typically, the 
librarian (a native speaker of Chinese) 
completed selection and communi- 
cated with the vendors located either 
in the United States or overseas, while 
the acquisitions staff member placed a 
single batch order attached to a shadow
record (i.e., a bibliographic record
invisible to patrons). This record 
contained codes that only these two 
people were able to decipher at any 
given time. Another part-time staff 
member at the cataloging department, 
a Korean native with strong reading 
skills in Chinese and Japanese, helped 
the acquisitions staff with receiving. 
Because of the nature of CJK resourc- 
es, books ordered at the same time 
were rarely shipped in one package, 
and it was very difficult to keep track 
of orders. When the order request was 
filled in several shipments, invoicing 
became confusing even with excellent 
records upon receipt. Toward the end 
of the fiscal year, encumbered funds 
and orders to be rolled over to the next 
fiscal year became confusing. Keeping 
track of a batch of fifty titles or more 
delivered in four or more shipments 
over a period of a few weeks was near- 
ly impossible, and RUL completely 
overlooked the problems that drop 
shipments (i.e., when orders were 
completed by an overseas publisher 
upon the initiation of the vendor) 
were bound to cause. Even with sev- 
eral adjustments to address the lack 
of the necessary language skills at 
the point of acquisition, the workflow 
looked like a total disaster from the 
budgeting and invoicing points of view 
despite the collaboration and RUL's 
best intentions. The light at the end 
of the tunnel showed only when a full- 
time CJK cataloger librarian was hired 
with the language skills required for 
this complex task. 

Lessons Learned 

RUL's experience of ordering a large 
number of foreign language titles from 
a variety of resources in a period 
of nearly five years may be helpful 
for technical services departments of 
other libraries that need to explore 
foreign acquisitions workflows. With 
diminishing resources available, academic
libraries face the challenges of
designing more effective workflows
and may want to experiment with 
stronger collaboration of librarians, 
technical services staff, and vendors. 

The Effect of Foreign Acquisitions on 
Workflow and Staffing 

In comparison to English language 
acquisitions, the workflow to acquire 
foreign language titles is typically char- 
acterized by less automation, longer 
processing times, and more frequent 
human intervention. Benchmarks in 
technical services, usually based on 
English language titles, do not work 
with foreign language titles. Problem 
orders with U.S. imprints, including 
rush orders, out-of-print books, and 
special and rare formats and titles,
frequently have separate workflows 
in the acquisitions department's daily 
practice. Additionally, foreign orders 
may have distinct workflows for each 
language, depending on language 
expertise, selection resources, and the 
availability of titles and bibliographic 
records. Occasionally special foreign 
orders, such as audiovisual materials 
or computer files, may mean an even 
greater challenge with their compat- 
ibility issues. 

The acquisitions department at 
RUL did not have to deal with extreme 
difficulties of staff turnover during 
these years. Cross-training allowed 
scheduling staff members according to 
demands. There were noticeable dif- 
ferences in the workload in the differ- 
ent phases of the fiscal year, including 
peak times of ordering followed by an 
overwhelming number of shipments 
to be processed. Making the best use 
of staff time entailed assigning the 
appropriate level and number of staff 
to handle the particular demands at 
any given time. One example is open- 
ing boxes shipped to the library, an 
assignment requiring minimal foreign 
language skills. In practice, the books 
are placed on book carts in the same 
order as they are listed on the packing 
list. If done correctly, usually by part-time
student employees with some
foreign language skills, staff members
who usually work with English books 
can match books in hand to the pack- 
ing slips and can quickly receive and 
copy catalog titles in any foreign lan- 
guage with Roman scripts. 

Clear expectations at each step 
of the foreign language acquisitions 
process as well as extensive docu- 
mentation understandable at each 
level seem to have helped eliminate 
anxiety and frustration, a concern
often mentioned by staff members 
in the context of foreign language 
acquisitions. Creating a department- 
wide skills inventory and document- 
ing work processes allowed a more 
sensible reorganization of the work- 
flow. Simple but creative solutions, 
such as a problem shelf established 
at a central location in the techni- 
cal services building, proved to be 
helpful. Once staff members realized 
that they were not supposed to go 
beyond their responsibilities or aban- 
don their routines for foreign titles, 
they did not become disappointed so 
easily. Neither did they distract their 
coworker in charge of troubleshooting 
by interrupting them with every single 
book — they simply placed problematic 
items on the shelf for later review. The 
shelf also made it evident and observ- 
able that other staff members had 
problems, too. Troubleshooting was 
completed by the same staff mem- 
ber who made the first step in the 
workflow, which allowed a deeper 
insight into the nature of the issues. 
In the long run, this practice yield- 
ed a thorough analysis of errors and 
resulted in the elimination of many of 
them. Handling problems in batches 
revealed patterns and intrinsically sug- 
gested solutions. Assigning one person 
to solve problems also avoided the 
duplication of efforts in this area. The 
optimal workflow, streamlined with 
Spanish and Portuguese titles, allowed 
the foreign language acquisitions staff 
member to dedicate her time entirely 
to language-specific tasks because the 
number of errors and returns dramati- 
cally dropped to less than 3 percent. 
Vendor-supplied bibliographic records
(available in the Research Libraries 
Group's union catalog at that time and 
from OCLC later) have played a role in 
reviewing and reorganizing the work- 
flow, making Spanish and Portuguese 
language acquisitions more efficient. 

RUL's case shows that with proac- 
tive management, consolidating for- 
eign language acquisition workflows 
and integrating the process into the 
daily routine in a large academic 
library setting is feasible. The redesign 
of the foreign language workflow was 
endorsed by the head and assistant 
head of the acquisitions department, 
and the supportive environment pro- 
vided by Technical and Automated 
Services also greatly facilitated chang- 
es by encouraging resource sharing, 
cooperation, and training. Another 
factor worth mentioning was revamp- 
ing intra- and interdepartmental com- 
munication, such as consolidating staff 
notes and records about problems 
or regularly inviting acquisitions staff 
members to cataloging group meet- 
ings. Identifying the components of 
the foreign language acquisition pro- 
cess from a larger perspective worked 
for the Technical and Automated 
Services staff members and encour- 
aged them to find their role in accom- 
plishing departmental goals. Foreign 
titles have proven to build bridges 
in many ways within and outside the 
library. 

Language Proficiency Needed 

As far as particular foreign language 
skills are concerned, the acquisitions 
department at RUL had the most dif- 
ficulty with CJK orders. The issues 
originated when selection, acquisition, 
and end-processing happened at the 
East Asian Library, where the librarian 
and the staff had the proper language 
expertise to select and process the 
books in the library, but separately 
from the Technical and Automated 
Services workflow. However, invoicing
and claiming, and especially
following up with problems, caused
a significant challenge for the acqui- 
sitions department and the budget 
office. After the retirement of the East 
Asian librarian, several attempts were 
made to overcome residual issues 
and integrate CJK ordering workflow 
into the general acquisition process in 
light of the successful workflow with 
other languages. Without the neces- 
sary language skills at the acquisi- 
tions department, CJK orders became 
nightmarish. Japanese titles were the 
only exceptions; these were selected 
by a librarian with the appropriate 
language skills and e-mailed to both 
the vendor and the acquisitions staff, 
the latter placing a batch order. The 
success was the result of the excel- 
lent communication and strong rela- 
tionship with a reputable vendor. A 
willingness to comply with the expec- 
tations at North American libraries is 
remarkable in the case of all interna- 
tional vendors. Without the high level 
of vendor cooperation, completing 
many of these orders would have been 
nearly impossible. 

The CJK acquisitions experience 
highlights the necessity of bibliograph- 
ic proficiency of staff members at 
the acquisitions department. Russian 
and other Cyrillic scripts presented 
no problems because more than one 
librarian and more than one acqui- 
sition and cataloging staff member 
had these language skills ranging from 
basic reading skills, such as the ability 
to verify whether the book in hand 
matches the book ordered, to more 
advanced, such as strong reading skills 
or near-native proficiency. Regarding 
the level of language skills required 
at an acquisitions department, basic 
language proficiency seems to be 
working, provided the selector and 
the cataloger at previous and subse- 
quent points in the process are able to 
provide support. Although linguistic 
abilities are frequently listed within 
cultural competencies, library schools 
do not have any language requirement 
nor do they offer special training for
language-related librarian positions or
copious language-related tasks in a 
library in public and technical ser- 
vices. 

The Spanish and Portuguese 
acquisition workflow at RUL can serve 
as a prime example that, at the ini- 
tial point of acquisition, the language 
proficiency of the staff member who 
locates records and orders the titles 
is crucial for the subsequent steps. 
Loading or keying in a proper record 
is vital to avoid duplication and costly 
returns, and also enables staff mem- 
bers with no or limited Spanish lan- 
guage proficiency to match the book 
in hand to the order and invoice upon 
receipt, which results in a significant 
increase of efficiency and a dramatic 
decrease in time spent troubleshoot- 
ing. In comparison with the workflow 
of current U.S. imprints, verifying the 
availability of proper records and iden- 
tifying them are the responsibilities of 
an experienced staff member, whereas 
the fairly routine part of receiving is 
done by employees in clerical posi- 
tions. The term proper record may not 
be satisfactory for cataloging purposes 
and is used here to denote a variety 
of bibliographic records with several 
access points for the user while com- 
plying with the minimum require- 
ments of an acquisition record. 

Collaboration: Selectors, 
Acquisitions Staff, and Vendors 

The acquisition of foreign language 
titles has been considered a central 
issue at RUL. Identifying barriers and 
bottlenecks in the workflow of foreign 
language acquisitions from a broader 
perspective allowed RUL to learn and 
encouraged RUL to experiment with 
various methods of collaboration using 
a variety of communication strategies. 
In the case of many languages, the col- 
laboration among all participants (i.e., 
vendor, acquisitions staff, cataloging 
staff, and selector) resulted in sig- 
nificantly faster turnaround times and 
fewer mistakes. Vendor performance 
was assessed and continuously monitored,
which led to the creation of an
in-house vendor database based on the 
selection sources. Improved commu- 
nication also ensured that all partici- 
pants of the acquisitions process were 
on the same page concerning the level 
of automation in a particular language. 
As pointed out before, the workflow 
of ordering Russian language titles 
was seamless for two main reasons: 
sufficient language proficiency at all 
points and the excellent collaboration 
between the vendors and the library, 
including the selector, acquisitions and 
cataloging staff, and the library budget 
office. 

Whenever available, vendor- 
supplied records were used for foreign 
language titles acquisitions to verify 
data and availability. In most cases, the 
advantages of having an easily available 
and mostly correct record balanced 
the inaccuracies of the record. The 
acquisitions record served as a source 
to keep track of orders both by the 
acquisitions staff and the selectors, as a 
starting point for catalogers to upgrade 
the record, and as an access point for 
reference librarians and library users 
while the title was still at an early 
phase in Technical and Automated 
Services. Few libraries can afford to 
hire librarians and staff members with 
extensive skills in every language at all 
service points and in all departments, 
but collaboration is available for all. 

The Foreign Language 
Acquisitions Experience 

In comparison to North American 
acquisitions, the foreign language ven- 
dor experience is not better or worse, 
just different. Favorable experienc- 
es far outweigh the complications. 
Troubleshooting is considered one of 
the main criteria to evaluate vendor 
services, and most foreign language 
vendors match or even surpass
domestic vendors in this area. Not
only do they respond to problems in
a timely manner with creative solutions,
but they also demonstrate a
noticeably high level of responsibility 
and accountability. Another important 
criterion of evaluation, handling order 
requests out of the ordinary workflow, 
can also be pointed out as one of 
the strongest points of many foreign 
language vendors. Foreign language 
acquisitions are full of difficulties. The 
challenge is to be resourceful and 
always ready to overcome linguistic, 
cultural, physical, and communication 
barriers. 

References 

1. "Acquisition and Distribution of 
Foreign Language and Area Studies 
Materials," Journal of Library 
Administration 29, no. 3/4 (2000): 
51-75; "Research Libraries in a Global 
Context: An Exploratory Paper," 
Journal of Library Administration 29, 
no. 3 (2000): 77-91. 

2. "Research Libraries in a Global 
Context," 83. 

3. "Acquisition and Distribution of 
Foreign Language and Area Studies
Materials." 

4. "Research Libraries in a Global 
Context." 

5. Patricia Brennan and Jutta Reed- 
Scott, Cooperative Strategies in 
Foreign Acquisitions, SPEC Flyer no. 
195 (Washington, D.C.: Association 
of Research Libraries, Office of 
Management Services, 1993). 

6. Robert Burger, "Slavic Technical 
Services," Technical Services Today 
and Tomorrow, ed. Michael Gorman 
(Littleton, Colo.: Libraries Unlimited, 
1990): 130-41. 

7. Keren Dali and Juris Dilevko, "Beyond 
Approval Plans: Methods of Selection 
and Acquisition of Books in Slavic 
and East European Languages in 
North American Libraries," Library 
Collections, Acquisitions, & Technical
Services 29, no. 3 (Sept. 2005): 238- 
69. 

8. Sue Henczel, "Selecting and Acquiring 
Library Materials in Languages Other 
than English: Establishing Non- 
English Collections for Public, School 
and Academic Libraries," Collection 
Building 22, no. 3 (2003): 141-45. 



9. Knut Dorn, "Champagne Taste 
and a Beer Budget: The Problem 
of Increasing Scholarly Publishing in 
Europe and Decreasing Academic 
Library Budgets in North America" 
(paper presented at the annual 
meeting of the Center for Research 
Libraries, Chicago, Apr. 24, 1992). 

10. Charlene Kellsey, "Cooperative 
Cataloging, Vendor Records, and 
European Language Monographs," 
Library Resources & Technical 
Services 46, no. 3 (July 2002): 105- 
10. 

11. Charlene Kellsey, "Trends in Source 
of Catalog Records for European 
Monographs 1996-2000," Library 
Resources & Technical Services 45,
no. 3 (July 2001): 123-26. 

12. Laura D. Shedenhelm and Bartley A. 
Burk, "Book Vendor Records in the 
OCLC Database: Boon or Bane?" 
Library Resources & Technical 
Services 45, no. 1 (Jan. 2001): 10-19. 

13. Jeffrey Beall, "The Impact of Vendor 
Records on Cataloging and Access 
in Academic Libraries," Library 
Collections, Acquisitions, & Technical
Services 24, no. 2 (Summer 2000): 
229-37. 

14. Kellsey, "Trends in Source of Catalog 
Records." 

15. Library of Congress, Program for 
Cooperative Cataloging, MARC 
Record Guide for Monograph 
Aggregator Vendors, www.loc.gov/ 
catdir/pcc/sca/FinalVendorGuide.pdf 
(accessed Mar. 19, 2008). 

16. Mary Walker and Deb Kulczak, 
"Shelf-Ready Books using PromptCat 
and YBP: Issues to Consider," Library 
Collections, Acquisitions, & Technical
Services 31, no. 2 (June 2007): 
61-84. 

17. OCLC, "WorldCat Selection Service," 
webinar presented Feb. 2007,
www5.oclc.org/downloads/wcsp/default.htm
(accessed Mar. 19, 2008).

18. Library of Congress, "News from 
the Library of Congress: Library of 
Congress, Bibliotheque nationale de 
France, Deutsche Nationalbibliothek 
and OCLC Enhance VIAF Project" 
press release, Nov. 16, 2007,
www.loc.gov/today/pr/2007/07-236.html
(accessed Mar. 19, 2008). 






LRTS 53(2) 



Notes on Operations 

Creating Organization Name 
Authority within an Electronic 
Resources Management System 

By Kristen Blake and Jacquie Samples 



Staff members at North Carolina State University (NCSU) Libraries have identi- 
fied the need for name authority control within E-Matrix, a locally developed 
electronic resources management (ERM) system, to support collection intelligence, 
the process of collecting, collocating, and analyzing data associated with a collec- 
tion to gain a sophisticated understanding of its qualities for strategic planning 
and decision making. This paper examines the value of establishing authority 
control over organization names within an ERM system in addition to describing 
NCSU's design for conducting name authority work in E-Matrix. A discussion 
of the creation of a name authority tool within E-Matrix is provided along with 
illustrations and examples of workflow design and implementation for the assign- 
ment of authoritative headings. Current practices related to authority control and 
ERM systems in academic libraries and within organizations such as the Online 
Computer Library Center (OCLC) are also investigated and summarized to pro- 
vide context for this project. Future possibilities for the use of this type of author- 
ity control on the part of librarians, vendors, and standards bodies are explored. 

As electronic resources management (ERM) systems become more advanced
and their use more widespread, libraries have begun to consider the potential 
of these systems to aid in collections decisions by performing advanced data anal- 
ysis functions. Name authority control is of critical importance if ERM systems 
are to be put to this use because information drawn into a system from different 
sources must be collocated to produce accurate and useful analyses and reports. 
Throughout the development of E-Matrix, a homegrown ERM system, North 
Carolina State University (NCSU) Libraries has focused on the application's 
potential to facilitate effective collection intelligence, the process of collecting, 
collocating, and analyzing data associated with a collection to gain a sophisticated 
understanding of its qualities in order to strategically plan and make decisions. 
E-Matrix centralizes information from the library's catalog, link resolver, and 
assorted flat files within a single database and can perform analysis functions 
that include data from all of these sources. A challenge presented by this process 
is the identification and collocation of data elements imported to E-Matrix in a 
multiplicity of uncontrolled formats. 

Data about organizations, such as the names of publishers, vendors, provid- 
ers, and licensors of serials and electronic resources, has been the most difficult 
element to normalize within E-Matrix. Because organization names are imported 
from unformatted fields created for outside applications, the data in E-Matrix nat- 
urally lacks consistency. The names of organizations appear in dozens of variant 
and erroneous forms, with neither any indication of connections between entities 
that indicate business relationships nor authorized forms of names. In aiming to 
use E-Matrix as a sophisticated reporting and collection intelligence tool, NCSU
Libraries came to the conclusion that
the application must apply authority 
control to its organization name data 
to correct these inherent irregularities. 
Following that decision, library staff 
members have implemented a project 
to create a set of singular authorized 
headings to control and normalize 
the organization name data stored in 
E-Matrix. 
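
The idea of controlling variant organization names with a single authorized heading can be illustrated with a small Python sketch. It is not a description of E-Matrix itself: the class, the method names, the matching rule (case and punctuation are ignored), and the organization names are all assumptions made for the example.

```python
def normalize_key(name: str) -> str:
    """Reduce a name to a lowercase, punctuation-free lookup key."""
    kept = "".join(ch if ch.isalnum() or ch.isspace() else " " for ch in name)
    return " ".join(kept.lower().split())


class NameAuthority:
    """Maps variant forms of an organization name to one authorized heading."""

    def __init__(self):
        self._variant_to_heading = {}

    def add_heading(self, heading, variants=()):
        # The authorized heading is registered as a variant of itself.
        for form in (heading, *variants):
            self._variant_to_heading[normalize_key(form)] = heading

    def resolve(self, raw_name):
        """Return the authorized heading, or None if the name is not yet controlled."""
        return self._variant_to_heading.get(normalize_key(raw_name))


# Hypothetical organization and variant forms.
authority = NameAuthority()
authority.add_heading("Example Press", variants=["Example Pr.", "EXAMPLE PRESS, INC."])

heading = authority.resolve("example press inc")
unmatched = authority.resolve("Unknown Vendor")
```

Names that resolve to `None` would still need review by a staff member; a lookup of this kind only collocates forms that have already been assigned to a heading.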

This paper reports on the pro- 
cess of creating authoritative data for 
organization names within E-Matrix 
at NCSU Libraries. The discussion 
begins with a brief literature review 
and an analysis of how libraries and 
library organizations have been using 
electronic resource management sys- 
tems to manage organization names. 
It then describes the planning and 
implementation of an organization 
name authority at NCSU Libraries. 
The paper concludes with an analysis 
of future possibilities for data use and 
control within ERM systems. 

Literature Review 

Over the past decade, ERM systems 
have emerged as the accepted tool for 
storing and managing complex data 
about serial and electronic resources, 
including information about the orga- 
nizations that publish, sell, and host 
those resources. As early as 2004, the 
Digital Library Federation's Electronic 
Resource Management Initiative 
(DLF ERMI) supported the use of 
ERM systems for tracking organiza- 
tion data, defining them, in part, as 
tools that would centralize data from 
disparate areas of large libraries to 
aid in the selection and evaluation 
of electronic resources. 1 The DLF 
ERMI report stresses the need to 
define standards and best practices for 
data elements stored within an ERM 
system, but does not delve into these 
features on a practical level. 

Since the publication of the DLF 
ERMI report, no further literature 
has been published that examines 



conceptually or practically the work 
needed to establish a data collection 
and analysis function within an ERM. 
The data management facet of ERM 
functionality has largely been eclipsed 
in practice by the more urgent need 
to implement licensing and workflow 
functions. Recent articles on ERM 
systems tend to mention data collec- 
tion and reporting only briefly and as a 
corollary to broader processes, such as 
connecting support staff with needed 
management information or creating 
general tools for bibliographers. 2 Most 
often, the challenges of using the data 
collection functions of ERM systems 
are simply assigned to the realm of the 
future. While the need for ERM tools 
to facilitate collection intelligence and 
reporting remains in the professional 
consciousness, it has not yet been 
explored in a meaningful way. 

Despite the lack of targeted dis- 
cussion about the role of data collec- 
tion within ERM systems, the general 
process of creating useful administra- 
tive metadata has been touched upon 
in other contexts and proves useful 
here. Hawthorne, as well as the DLF 
ERMI report, addresses the need for 
standards to avoid labor duplication 
within and between libraries working 
with ERM systems, but advises staying 
focused on broader design principles 
rather than addressing the nature of 
the data that will be collected and 
manipulated. 3 Gorman, while not 
specifically addressing ERM systems, 
adds a layer of insight to the equa- 
tion when he discusses the need for 
all metadata content to be subject to 
the same stringent requirements as 
bibliographic content. What use is a 
set of standardized fields when the 
data within those fields can vary so 
broadly? To be truly useful as a tool 
for access and collocation, he argues, 
the content of metadata fields must be 
subject to some level of authority con- 
trol. 4 The application of this concept to 
ERM data presents strong support for 
NCSU Libraries' decision to incorporate
a name authority into E-Matrix. 



Current Practices in Authority 
Control and ERM Systems 

To provide context for NCSU 
Libraries' organization name author- 
ity project, an informal survey of 
academic libraries known to have 
begun ERM system implementation 
was conducted to gauge the use of 
organization name data within these 
systems. Telephone interviews were 
conducted in October and November 
of 2008 with nine professionals from 
nine institutions, including Patrick 
Carr from Mississippi State University; 
Jill Emery from the University of 
Texas-Austin; Diane Grover from the
University of Washington; Patricia 
Martin from the California Digital 
Library; Kim Maxwell from the 
Massachusetts Institute of Technology; 
Ophelia Payne from the University of 
Virginia; Clara Ruttenberg from Johns 
Hopkins University; Barbara Weir 
from Swarthmore College on behalf 
of the Tri-College Consortium; and 
Paoshan Yue from the University of 
Nevada-Reno. These discussions iden- 
tified the duration and extent of each 
library's experience with its ERM; the 
functions each library supported, or 
planned to support, with its ERM; and 
the uses, if any, each library had found 
for data related to organizations. 

The survey revealed that these 
libraries cover a wide spectrum in 
the extent of their ERM develop- 
ment. Of the nine libraries contacted, 
eight owned an ERM system, and 
one was in the process of evaluating 
a system for purchase after reject- 
ing a previously purchased product. 
Six owned commercial systems, and 
two were transitioning from home- 
grown systems to commercial prod- 
ucts. The products represented by 
the surveyed libraries included Verde 
by Ex Libris, Electronic Resources 
Management by Innovative Interfaces, 
and 360 Resource Manager by Serials 
Solutions. MIT was the most experi- 
enced ERM library, having created the 
homegrown system Vera in 1999 as a 
FileMaker Pro database. The four least
experienced libraries had owned their 
current ERM systems for less than 
one year. Of the eight libraries that 
owned ERMs, four still considered 
themselves to be in the implementa- 
tion stage. The other four considered 
their ERM systems functional, but 
indicated that they were still adding 
new features and hardly considered 
their systems done. 

The functions these libraries sup- 
ported with their ERMs were as var- 
ied as their stages of development. 
Librarians reported using their systems 
for managing licenses, maintaining 
holdings data, storing contact infor- 
mation for customer representatives, 
tracking orders, managing workflow, 
generating usage statistics, batchload- 
ing e-journal metadata to an online 
public access catalog, and storing infor- 
mation about product trials. The librar- 
ies that only recently acquired their 
ERMs tended to have implemented 
only one or two of these functions, 
primarily in the areas of licensing and 
holdings data, while the more experi- 
enced libraries had branched out into 
additional functions. Grover, electronic 
resources coordinator at the University 
of Washington, which has been using 
Innovative Interfaces' Electronic 
Resources Management since 2003, 
named at least a half dozen creative 
ERM functions under development 
at her library, including database 
and e-book management, Internet 
Protocol address range tracking, and 
SUSHI-compliant usage data feeds. 5 
(SUSHI stands for the Standardized
Usage Statistics Harvesting Initiative
Protocol, a standard that
defines an automated request and 
response model for the harvesting of 
electronic resource usage data through 
a Web services framework.) 6 

While an authority file of orga- 
nization names could be beneficial 
to several of the above functions — 
specifically producing usage reports 
and storing publisher contact informa- 
tion — most of the libraries surveyed 
had not made use of an organization 



name authority within their ERM sys- 
tems. Of the librarians surveyed, five 
reported that their institutions were 
not yet far enough along in the imple- 
mentation process to give the idea seri- 
ous consideration. Martin, director of 
bibliographic services at the California 
Digital Library (CDL), which was still 
exploring new options for ERM sys- 
tems at the time of the interview, 
said CDL's implementation team had
expressed a desire for greater organi- 
zation name control and would proba- 
bly examine the situation more closely 
once it had selected a product. 7 Weir,
of Swarthmore College, said that the 
Tri-College Consortium (Swarthmore, 
Bryn Mawr, and Haverford colleg- 
es) was still involved in the basics 
of implementing Verde's workflow 
features and had not yet gotten to 
the point where it could think about 
reporting functions. 8 All of the librar- 
ians who had not yet considered the 
use of organization names within their 
ERM systems acknowledged that the 
practice could be useful at some point 
in the future. 

Three librarians said that their 
institutions had given some level 
of consideration to the problem of 
organization name authority and had 
decided that a solution was not nec- 
essary at the present time. Grover 
said the University of Washington 
Libraries had looked at using fixed 
fields within Innovative's system to 
differentiate between publishers and 
access providers. Ultimately, the staff 
decided that control of organization 
names could be useful but did not warrant
an elaborate solution at the time. 9
Emery echoed that perspective, say- 
ing that University of Texas-Austin's 
primary focuses for its ERM system 
were public access and workflow func- 
tions, not descriptive metadata. 10 Carr, 
serials coordinator at Mississippi State 
University, said his institution has not 
needed the functionality of a name 
authority because it primarily uses 
organization names within its ERM 
system to store contact information 
for customer representatives, and 



organization names supplied by Serials 
Solutions so far have been sufficient to 
support that function. 11 

Of the librarians contacted, only 
Maxwell, serials acquisitions librar- 
ian and associate head of acquisitions 
and licensing services at MIT, said 
that her library had developed a fully 
realized solution for tracking organiza- 
tion names through its ERM system. 12 
Maxwell described that solution, 
which dates back to the late 1980s. At 
that time, a librarian at MIT created a 
local database called Commitments to 
track serials pricing by publisher. As 
each new publisher was added to the 
list, an authoritative name was decided 
upon and maintained with each new 
entry. Relationships between publish- 
ers were also tracked as companies 
were bought and sold. As the needs 
of the library regarding electronic 
resources evolved, MIT integrated 
Commitments into Vera, its home- 
grown ERM system, and has kept the 
list up-to-date over the years. In 2008, 
MIT planned to abandon Vera in favor 
of Ex Libris's Verde system. Maxwell 
said that Verde would not have the 
same organization name capabilities 
as Vera, and would rely instead on a 
central knowledgebase maintained by 
Ex Libris. Maxwell anticipated the 
maintenance of publisher name data 
in Verde to be different and more confusing
than MIT's current system
and said MIT will likely rely on the 
existing Commitments database until a 
new solution can be developed. 

Speaking with colleagues in the 
academic library profession allowed 
a number of useful conclusions to be 
drawn about the role of organization 
name data in ERM systems. First, 
the control and manipulation of orga- 
nization name data is not a project 
that many libraries have considered, 
often because of a lack of resources 
or expertise in the ERM implemen- 
tation process. Second, organization 
name control holds varying degrees of 
importance for libraries. While some 
institutions may see it as an impor- 
tant component of their reporting and 
evaluation practices, others regard it as
optional or outside their focus. Finally,
based on MIT's example and NCSU
Libraries' own experience, organiza- 
tion name control is an issue that tends 
to be an enterprise venture originat- 
ing in the library, since commercial 
systems do not usually facilitate the 
creation or maintenance of that data. 

OCLC as a Source for 
Organization Name Authority 

In addition to investigating the roles 
that name authority has played in elec- 
tronic resources management in the 
academic library field, NCSU Libraries 
also sought context for its project from 
the examples of two recent authority- 
based initiatives originating at the 
OCLC Online Computer Library 
Center (OCLC). The WorldCat 
Registry (http://oclc.org/registry), a 
directory of institutional data about 
libraries and consortia and the services 
they provide, aims to function as a 
global authority file and may have value 
as a source for authoritative data about 
the institutions with which libraries 
interact. OCLC's publisher name serv- 
er (http://oclc.org/research/projects/ 
publisherns) is a research project to 
build a service that will normalize pub- 
lisher names and provide users with 
other relevant metadata. Each project 
illustrates the importance of author- 
ity control in managing organization 
data, highlights some advantages and 
drawbacks of current approaches, and 
informs the process undertaken at 
NCSU Libraries. 

The WorldCat Registry, which 
debuted in February 2007, has been 
marketed as a product that will allow 
libraries to manage the details of their 
identities (i.e., names, aliases, parent- 
child relationships, and IP addresses, 
among other elements) and make 
them available through a centralized 
database to a variety of third par- 
ties, including vendors, consortia, and 
other libraries. 13 This product hits 
on the critical concept that metadata 



about organizations is essential in a 
field where libraries and library ser- 
vice providers deal with many dif- 
ferent groups on a daily basis. Just as 
libraries benefit from tracking information
about content-providing organizations,
they also have an incentive
to ensure that subscription agents, 
vendors, and other service provid- 
ers receive consistent and accurate 
data about the libraries themselves. 
In addition to providing data about 
libraries, the WorldCat Registry 
allows entries for publishers and other 
groups working in the library sphere. 
The data contained in those entries 
suggests the registry may be of some 
use in creating organization authority 
files within ERM systems. 

Unfortunately, the WorldCat 
Registry suffers many of the same 
drawbacks that plague ERM data. 
Because organizations are responsi- 
ble for keeping up their own entries, 
inconsistencies and inaccuracies often 
appear in the data. For example, 
a search for "North Carolina State 
University" returns nine results, 
including one heading for the uni- 
versity's main library and separate 
headings for each of its four branch 
libraries. No parent-child or other 
linking relationships have been estab- 
lished between them, even though a 
central body governs all five. While 
the existence of fewer than a dozen
variations on the NCSU Libraries name may
be a step up from the scores of vari- 
ant names that crop up in some ERM 
data, the standard of control needed 
for effective ERM functions is not 
met. In addition to concerns about the 
consistency of its data, the WorldCat 
Registry's self-maintenance policy also 
presents a challenge because with- 
out formal review and enforcement 
of content standards, many records 
for library-related organizations may 
remain incomplete or fall out of date. 
While it may serve as a useful refer- 
ence tool, it cannot provide the level 
of detail and consistency needed by 
those wishing to create name authority 
within an ERM system. 



OCLC's publisher name server, 
which is still in the research phase, 
presents a more tailored solution to 
the problem of organization name 
authority. The service aims to resolve 
variant publisher names to a single, 
authorized form and make available 
relevant data about each publisher, 
including its location, language, genre 
and format, subject areas, and par- 
ent and subsidiary companies. Lynn 
Silipigni Connaway, head of the proj- 
ect's research team, said that she and 
her associates originally viewed the 
project, much of which is being done 
algorithmically, as a data mining exer- 
cise. 14 As their work progressed, she 
realized the advantages the server 
could offer to many areas of librari- 
anship, chiefly collection intelligence 
and analysis. Additionally, the service 
has the potential to facilitate quality 
control in library catalogs, and may 
be of use to catalogers sometime in 
the future. While no prototype has 
yet emerged, Connaway reported that 
the publisher name server project has 
already generated a great deal of inter- 
est, including weekly inquiries from 
members of the library and publishing 
communities. 

While OCLC's publisher name 
server offers many features that could 
be helpful to ERM system users in 
establishing name authority, it can- 
not be considered a full or viable 
solution at this time. As of late 2007, 
Connaway said that the service would 
only resolve book publishers, in keep- 
ing with OCLC's focus on projects 
and services that address monographic 
titles and holdings. 15 Without attention 
to serial publishers, users will be left 
with an incomplete data set. Equally 
important, the publisher name author- 
ity is still under development, and 
many libraries might not be able to 
wait until a publicly accessible version 
of the application is released to begin 
exploring the use of name author- 
ity. While it may not be immediately 
compatible with efforts to establish 
authority control over organizations 
related to serials, OCLC's publisher 



98 Blake and Samples 



LRTS 53(2) 



name authority server nonetheless 
demonstrates a need for organization
name authorities and may provide
context for librarians whose methods 
and research have already prompted 
similar projects. 



Organization Name Authority 
at NCSU Libraries 

At NCSU Libraries, institutional 
needs and the unique requirements of 
E-Matrix have required the develop- 
ment of an organization name author- 
ity tool fully integrated into the ERM 
system. The need for this tool was rec- 
ognized nearly five years ago, when the 
original E-Matrix development team 
discussed the product's capacity to pro- 
duce sophisticated evaluative reports. 
The team realized that to produce, for 
instance, accurate reports of money 
spent sorted by publisher, vendor, or 
provider, the names of those organiza- 
tions needed to be consistent across 
all instances. Unfortunately, because 
organization name data is imported 
into E-Matrix from sources without 
authority control, the names found 
throughout the application would vary 
widely, as seen in figure 1, which illus- 
trates the multiplicity of names associ- 
ated with Elsevier. During that early 
period, creation of a name author- 
ity was proposed and approved as a 
solution to the problem of consistency,
but implementation was postponed
because of uncertainty about how to
proceed and a lack of
precedent for organization authority in 
traditional library data sources. 

As development of E-Matrix con- 
tinued, additional justifications for a 
name authority arose. Design of the 
licensing module required the capa- 
bility to link a license to specific titles 
through the licensor field. As with 
the reporting module, accuracy in the 
licensing module required a list of 
unique, authorized names from which 
users could choose when mapping 
a new license. For that list to exist, 



authoritative names would 
need to be assigned to all 
organizations imported into 
E-Matrix. 

Organization roles also 
strongly suggested the need 
for a name authority. The 
DLF ERMI report origi- 
nally defined the concept of 
roles by describing how an 
organization could occupy 
a number of them, such as 
vendor, provider, publisher, 
licensor, and so on. 16 By 
assigning one or more roles 
to an organization, an ERM 
system will avoid the need- 
less duplication and confu- 
sion that might result from 
creating separate entities for 
each organization in each 
role. 17 Again, this feature of 
E-Matrix could only func- 
tion properly if organization names 
were assigned consistently throughout 
the data. Assigning multiple roles to 
an organization would make no sense 
if several variants of that organization's 
name could be found elsewhere. In 
short, authority control was crucial for 
clean relationships between organiza- 
tions and roles. 

In time, the concerns about data 
consistency raised by these aspects of 
E-Matrix made clear that the devel- 
opment of a name authority was not 
optional. A primary goal of E-Matrix 
was to bring together works related 
to one organization regardless of the 
form that name took in the original 
bibliographic descriptions. Without 
authority control, such correlation 
would be impossible. In light of those 
requirements and despite the lack 
of precedent in the field, NCSU 
Libraries deemed name authority cre- 
ation a top priority and assigned its 
implementation to the library's meta- 
data and cataloging department under 
the supervision of the continuing and 
electronic resources librarian. The 
project began in the fall of 2006 and 
still continues. The name authority 
Figure 1. Display of Elsevier Variants within the
Organizations Module of E-Matrix 



project has passed through planning, 
design, and implementation phases, 
each of which will be described sub- 
sequently along with a summary of 
results to the present. 

Getting Started 

Because no library or library-related 
group had previously attempted to 
compile an authoritative list of orga- 
nization names within an ERM tool, 
NCSU Libraries' name authority proj- 
ect had to be developed in-house from 
scratch. The continuing and electronic 
resources librarian, Jacquie Samples, 
working as part of the E-Matrix prod- 
uct committee, developed a model 
for creation of the authority through 
a pilot project. By taking a small sub- 
set of organizations from the ERM 
data and assigning them authorita- 
tive names, she determined important 
specifications for the larger project, 
including a preliminary analysis of how 
assigning authorities would affect the 
library's ERM data, an estimate of 
the project's timeline, and expectations 
for the type of work that would be 
required to complete the authority. 
The name authority pilot project
assigned authoritative headings
to 483 vendor names extracted from 
the library's integrated library system 
(ILS) in November 2006. Vendor 
names were chosen because they were 
deemed most useful for developing 
the licensing module, which had been 
identified as a priority around the same
time. Working from a spreadsheet, the 
continuing and electronic resources 
librarian evaluated each vendor and 
chose a preferred name. Altogether, 
444 authoritative vendor names were 
selected on the basis of predominant 
usage within the library community 
and assigned to the group of original 
names. Thirty-nine names (8 percent 
of the original sample) were variant 
names that could be linked through 
the use of an authoritative heading. 
Cross references were also established 
during the pilot project to represent 
business relationships between ven- 
dors — for example, companies that had 
been purchased by larger entities — as 
well as variant names not native to 
the data set. Additional roles were 
also noted for many of the entries, 
although they could not be incorpo- 
rated into the ERM data at that time. 
The prevalence of variant names, 
complex business relationships, and 
multiple roles all provided strong evi- 
dence that the entire E-Matrix data set 
required authority control of organiza- 
tion names. 

At the conclusion of the pilot project
in April 2007, Samples estimated
that approximately forty hours had 
been spent assigning names and cross 
references, as well as making notes 
about interesting or unusual circum- 
stances. Using the estimated number 
of organization names in E-Matrix at 
that time (about seven thousand) she 
determined that to assign authorita- 
tive names to the entire data set would 
take 580 hours — about three and a 
half months of full-time work for one 
person. Given the project loads of 
the librarians in the technical services 
departments at NCSU Libraries, the 



project likely would take the better 
part of a year for one librarian to 
complete. 
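The estimate above can be reproduced with a short calculation. The following is a minimal sketch that assumes effort scales linearly with the number of names and that full-time work means a 40-hour week; neither assumption is stated in the project documentation, and the variable names are illustrative only.

```python
# Figures reported for the pilot project.
pilot_names = 483      # vendor names handled in the pilot
pilot_hours = 40       # approximate hours spent on the pilot
total_names = 7000     # estimated organization names in E-Matrix

# Assumption: effort scales linearly with the number of names.
hours_per_name = pilot_hours / pilot_names
estimated_hours = hours_per_name * total_names

# Assumption: a 40-hour work week and 52 / 12 weeks per month.
HOURS_PER_WEEK = 40
WEEKS_PER_MONTH = 52 / 12
estimated_weeks = estimated_hours / HOURS_PER_WEEK
estimated_months = estimated_weeks / WEEKS_PER_MONTH

print(f"{estimated_hours:.0f} hours, {estimated_months:.1f} months")
```

Under these assumptions the calculation yields roughly 580 hours, or a little over three months of uninterrupted full-time work, which is in line with the figure Samples reported.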

In addition to the sheer amount of 
time that would be required to create 
a fully functioning name authority, the 
pilot project also revealed that much 
of the work would be composed of 
manual functions, including collocat- 
ing names that represented the same 
entity and determining authoritative 
headings for each group of names 
using library and serials industry 
resources. These tasks would require 
the expertise of a library staff member 
familiar with the nature of serials and 
common library-based information 
sources. To determine authoritative 
names, the individual performing the 
work would require a strong sense of 
the relationships that exist between 
libraries and organizations. The librar- 
ian also would need the ability to fol- 
low a set of guidelines that combine 
accepted authority practices with the 
needs of the university and the library. 
The guidelines would not be pre- 
scriptive, and the selection of names 
would require a strong element of 
judgment and a fluid and intuitive use 
of available tools. The ultimate goal 
would be to determine headings for 
each organization that best fit with 
the library's existing practices and the 
overall structure of the authority. In 
light of the nature of the name author- 
ity work, the possibility of selecting 
names in an automated fashion was 
rejected because it was uncertain if 
an automated system could effectively 
make the judgments needed to choose 
correct authoritative names. 

The demonstrated importance 
of the name authority project, along 
with the volume and intensity of the 
work it would require, convinced 
the E-Matrix committee and library 
administration that the authority tool 
deserved top priority. The continuing 
and electronic resources librarian and 
a programmer from NCSU Libraries' 
IT department were assigned as part 
of their E-Matrix work the tasks of 



designing a system to store and man- 
age name authorities and of using that 
system to assign authoritative names 
to each organization in the licensor, 
provider, publisher, and vendor roles 
within E-Matrix. An NCSU Libraries 
Fellow, Kristen Blake, was appointed 
to aid in the determination and assign- 
ment of authoritative names. Together, 
this group made up NCSU Libraries' 
E-Matrix name authority team. With 
the support of library administration 
and increased resources, the name 
authority had moved into the realm of 
the possible. 

Defining Structure 

Before the name authority project 
could be fully implemented, a module 
had to be designed within E-Matrix 
to manage preferred names and other 
authority data. The name authority 
team created a framework that could 
be integrated into the ERM sys- 
tem functions that had already been 
designed while remaining true to the 
vision of name authority established by 
the librarian working on the product's 
conceptual development. 

The original plan for the organiza- 
tion name authority within E-Matrix 
envisioned a record-based structure 
to link together related organiza- 
tion names and store descriptive 
data about each organization. Each 
authorized name would be stored 
on a record and connected to the 
work-level records of all resources 
featuring one of its variant names. 
The authorized name record would 
have the ability to store data rel- 
evant to the authorized heading as 
well as its variants. Variant names 
would remain in place on the work- 
level records to which they originally 
belonged, and those records would 
be used to store data unique to 
each variant. Additionally, the vari- 
ant names on each work-level record 
would be assigned one or more of 
the roles available in E-Matrix: licen- 
sor, provider, publisher, and vendor. 
Figure 2. Envisioned Structure for the Name Authority Tool



License records would also be linked 
to an authoritative name record 
through the work-level record and 
the nonauthoritative names associ- 
ated with that work. All of these 
data — titles of works, unauthorized 
names, and organizational roles — 
would loop back and display on the 
authoritative record. Figure 2 shows 
the relationships between records 
that link name authority headings to 
work-level records in E-Matrix. 

In addition to establishing associa- 
tions between resources and organiza- 
tions, the initial vision of the name 
authority tool also included the ability 
to store useful data on organization 
records. Authoritative name records 
would serve as the primary storage loca- 
tion for business history notes, internal 
remarks, vendor contact information, 
product trial details, and other infor- 
mation as needed. Nonauthoritative 
name records could also be used to 
store notes specific to a single variant 
or imprint. By incorporating a detailed 
record structure into the name author- 
ity, the library hoped to mimic the 
structure of established name authori- 
ties that serve as repositories of his- 
torical and local information, as well 
as guides for consistent and accurate 
data creation. 

The realities of implementing 
E-Matrix forced the library's program- 
mers and planners to apply a phased 
approach to the design of the name 
authority application. In the interest 
of getting the project started as quick- 
ly as possible, some of the features in 
the original proposal were assigned 
to later phases, and a name authority 
system was designed that included 
essential functions but did not yet 
incorporate more robust design fea- 
tures. The system links organization 
name data through unique identifiers 
in a relational database. A member of 
the name authority team can assign 
a name authoritative status, which 
places a property in the database 
record for that name, indicating that 
status. Once a nonauthoritative name 



is associated with an authoritative 
heading, database records for nonau- 
thoritative names and their roles link 
to that name algorithmically. When an 
authorized heading has been assigned 
to a certain organization name, all 
future instances of that name will be 
automatically subsumed into the exist- 
ing hierarchy. This essential feature 
will decrease the amount of mainte- 
nance necessary once the authority 
project has been completed. 
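The linking behavior described above can be illustrated with a small relational sketch. This is not the E-Matrix schema, which is not documented here; the table name, column names, helper functions, and the sample variant strings are all hypothetical and chosen only to show how an authoritative flag and an identifier-based link might subsume later instances of a known variant.

```python
import sqlite3

# Hypothetical schema: one row per distinct name string. A name is either an
# authorized heading (is_authoritative = 1) or points to one via authority_id.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE org_name (
        id               INTEGER PRIMARY KEY,
        name             TEXT NOT NULL UNIQUE,
        is_authoritative INTEGER NOT NULL DEFAULT 0,
        authority_id     INTEGER REFERENCES org_name(id)
    )
    """
)

def add_name(name):
    """Insert a name if it has not been seen before; return its row id."""
    conn.execute("INSERT OR IGNORE INTO org_name (name) VALUES (?)", (name,))
    row = conn.execute("SELECT id FROM org_name WHERE name = ?", (name,)).fetchone()
    return row[0]

def mark_authoritative(name):
    """Give a name authoritative status; a heading links to itself."""
    name_id = add_name(name)
    conn.execute(
        "UPDATE org_name SET is_authoritative = 1, authority_id = id WHERE id = ?",
        (name_id,),
    )
    return name_id

def link_variant(variant, heading):
    """Associate a nonauthoritative name with an authorized heading."""
    heading_id = mark_authoritative(heading)
    variant_id = add_name(variant)
    conn.execute(
        "UPDATE org_name SET authority_id = ? WHERE id = ?",
        (heading_id, variant_id),
    )

def resolve(name):
    """Return the authorized heading for a name, or None if uncontrolled."""
    row = conn.execute(
        "SELECT a.name FROM org_name v "
        "JOIN org_name a ON a.id = v.authority_id "
        "WHERE v.name = ?",
        (name,),
    ).fetchone()
    return row[0] if row else None

# Illustrative variant strings (not taken from the E-Matrix data).
link_variant("Elsevier Science", "Elsevier")
link_variant("Elsevier B.V.", "Elsevier")

# A later import of an already linked variant reuses the existing row,
# so it resolves to the same heading without further manual work.
add_name("Elsevier Science")
print(resolve("Elsevier Science"))
print(resolve("Unlisted Press"))
```

Because each name string is stored once and carries its own link, a repeated import of a known variant requires no new decision by the authority team, which is the maintenance saving the design aims for.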

The inclusion of a formal record 
structure for storing historical and 
internal data was also transferred to a 
later phase in the design of the author- 
ity tool. In its current incarnation, 
instead of using formal records like 
those a cataloger might find familiar, 
the names within the authority func- 
tion more like related nodes without 
a formal record structure. E-Matrix 
searches the authority database on 
the fly and, when data intersects, the 
relationships between names and roles 
are displayed within the E-Matrix user 



interface or on a report. 

Within this design scheme, the 
system of roles assigned to each orga- 
nization acts as a temporary substitute 
for the hierarchy of business relation- 
ships envisioned for the authority. 
While the authority team cannot cur- 
rently assign, for example, Elsevier as 
the current owner of Pergamon Press, 
the roles assigned to each organiza- 
tion can be manipulated to produce 
reports that would reflect that same 
relationship. Because Pergamon may 
be assigned to a resource in the pub- 
lisher role, and Elsevier assigned to 
that same resource in the provider 
role, a report listing all titles for which 
Elsevier is the provider along with their 
additional roles will indirectly display 
the relationship between Elsevier and 
publishers, like Pergamon, that it has 
acquired. 
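A report of this kind can be sketched in a few lines. The example below is an assumption-laden illustration rather than the E-Matrix reporting code: the resource titles, the organizations other than Elsevier and Pergamon Press, and the function name `provider_report` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical role assignments: (resource title, authorized name, role).
assignments = [
    ("Journal A", "Elsevier", "provider"),
    ("Journal A", "Pergamon Press", "publisher"),
    ("Journal B", "Elsevier", "provider"),
    ("Journal B", "Elsevier", "publisher"),
    ("Journal C", "Example Host", "provider"),
    ("Journal C", "Example Society", "publisher"),
]

def provider_report(assignments, provider):
    """List, for each title the provider supplies, the other roles assigned."""
    roles_by_title = defaultdict(list)
    for title, organization, role in assignments:
        roles_by_title[title].append((organization, role))

    report = {}
    for title, pairs in roles_by_title.items():
        if (provider, "provider") in pairs:
            # Keep every other organization and role attached to the title.
            report[title] = sorted(p for p in pairs if p != (provider, "provider"))
    return report

report = provider_report(assignments, "Elsevier")
for title, others in report.items():
    print(title, others)
```

In this sketch, Pergamon Press surfaces alongside Elsevier for "Journal A" even though no ownership link between the two organizations is stored, which is the indirect display of the relationship that the role system provides.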

The current system allows 
E-Matrix to accomplish the primary 
task of collocating all resources asso- 
ciated with a certain organization, 
Figure 3. Initial Interface Display of the Organization Name



though the lack of formal records 
delays the library's goals of using its 
ERM system to capture the syndetic 
structure of business relationships and 
use the ERM as a primary location for 
storing these data. The eventual addi- 
tion of that functionality to E-Matrix 
will allow the name authority project 
to move forward and more strongly 
fulfill the local needs identified early 
in the development of E-Matrix. 

Designing an Interface 

Once this structure was in place, the 
next step in the implementation pro- 
cess was the design of a convenient 
and intuitive way to assign and manage 
authoritative names. Since the start of 
the project, the name authority team 
has worked to design an interface 
within E-Matrix that would allow the 
authority to be easily accessed and 
manipulated. For this phase of the 
project, the technical services librar- 
ians contributed ideas for the design 
and functionality of the interface, 
while the programmer translated this 
vision into a series of interfaces, each 
enhanced and refined with feedback 
based on actual use. 

The first authoritative organization
names were initially stored in a local
database because the user inter- 
face for the E-Matrix authority had 
not been completed at the start of 
the project. Instead, the librarians 
assigning authoritative names used a 
Microsoft Access database for storage 
as they began working through a list of 
7,858 publisher names culled from the 
library's holdings in the SFX (Ex Libris' 
link resolver) knowledgebase. These 
publisher names were chosen not only 
because they provided sufficiently 
complex test data for the developing 
interface, but also because they repre- 
sented the names most desperately in 
need of authority control. (Publisher 
data from the library catalog were not 
initially included because much of it 
was duplicated in the SFX data, and 
the remaining publisher names could 
be controlled later by building on the 
headings already determined during 
the SFX phase.) The librarians evalu- 
ated each name and collocated it with 
others representing the same pub- 
lisher, recorded their decisions in the 
database, and made notes specifying 
the justification for each decision and 
any problems that might need to be 
investigated when the E-Matrix interface
became functional. This evaluative
period resulted in the determination
of authoritative headings that could be 
used later, as well as suggestions for 
the design of the integrated interface 
planned for E-Matrix. 

Throughout this early stage, 
designing a name authority interface 
shared top priority status with assign- 
ing names. Based on input from librar- 
ians who had tested the interface, the 
E-Matrix programmers produced a 
rudimentary beta interface, which was 
then adopted on a trial basis as part of 
the process of assigning names. The 
interface allowed the name author- 
ity team to select, on one screen, an 
authoritative organization name, as 
well as a group of names that should 
fall under its domain, and record that 
relationship within E-Matrix. This 
interface allowed the authority team 
to transition from recording their deci- 
sions only in the local database to 
actually entering them into E-Matrix, 
where they could be used experimen- 
tally by library staff. The beta interface 
had many limitations, however, and 
the local database was maintained as 
a backup and a place to record com- 
ments and problems. 

The authority team continued
working within E-Matrix, and the
team's observations helped the programmers
develop an improved name
authority interface, which was released
with E-Matrix 1.0 in December 2007.
This interface allows names to be
assigned using a simple four-step process.
In step 1, the browse or search
feature is used to identify and select
names that will be collocated under a
common authorized heading. Figure 3
shows the results of a search for "Duke
University" and lists the names that
need to be collocated.

Figure 4. Remaining Steps in Assigning an Authoritative Organization Name (step 2, Select/Create Authoritative Organization; step 3, Confirm Organizations; step 4, confirmation of the assignment)

In step 2, an authoritative name
can be chosen from among those
selected or a new name entered manually.
In step 3, the user can perform
a final review of selections and assignments
and submit them. The fourth
and final step confirms the assignments
made and offers the user a link
to begin the process again. Steps 2-4
are shown in figure 4.

The new interface made the
team's work easier because it included
expanded display options that clari- 
fied whether an organization already 
had an authoritative heading assigned 
to it and what that heading was. 
The authoritative relationships were 
also displayed in more places within 
E-Matrix, making the product useful 
as a source of organizational data for 
the authority team and all library staff 
using the ERM. In comparison with 
the beta version, the first production 
interface was more intuitive, incorpo- 
rating browse functionality, a cleaner 
design, and the ability to correct errors 
by reassigning authoritative headings. 
With the addition of a notes feature
in a future release, the team will
transition their workflow completely
into the ERM system.

Crafting Policies 

At the same time that an interface 
was being designed, the continuing 
and electronic resources librarian was 
working to create the practical guide- 
lines and specifications needed to 
choose authoritative names and assign 
them to organizations within E-Matrix. 
The most important deliverables were 
guidelines that ensured accuracy and 
consistency in the selection of authori- 
tative names and the identification of 
organizations that will fall under them 
and a set of tools to help apply those 
guidelines. 

Creating clear, accurate authori- 
tative headings that were useful for 
the NCSU Libraries staff has been 
the primary consideration in decid- 
ing how names should be assigned 
within E-Matrix. In the interest of 
local policies, the first step taken in 
establishing name selection guidelines 
was to consult the library's collec- 
tion managers. Because they were 
the people who would be making use 
of the authority data most frequently 
for collection evaluation, the authority 
needed to reflect their preferences 
and standards. The collection manag- 
ers indicated that their chief goal was 
to preserve the organization name that 
most directly reflected the intellectual 
content of a work. Most commonly, 
this directive affected the assigning of 
authoritative publisher names, which 
often are more directly tied to spe- 
cific content areas than a vendor or 
provider name. Thus, to maintain that 
connection, the original publisher of 
a title is almost always chosen as the 
authoritative heading, even if that pub- 
lisher has since merged with or been 
acquired by another entity. In other 
words, the "statement of responsibil- 
ity" takes precedence over any current 
business arrangements. Journals that
have been published by Academic
Press, for example, are kept under the
Academic Press heading, even though
Academic Press has long been an
imprint of Elsevier.

In many cases, publisher state- 
ments that have been imported from 
the library catalog are long and con- 
voluted. Large societies often issue 
publications on behalf of smaller soci- 
eties and, in these cases, collection 
managers presume that the larger soci- 
ety is acting as a kind of benefactor, 
while the smaller society represents 
the creator of the journal's content 
and, therefore, the better choice for 
an authoritative heading. For exam- 
ple, a publisher statement reading, 
"Published for the International Union 
of Biochemistry by Elsevier" would 
be assigned the authoritative name 
"International Union of Biochemistry" 
because that group entity is presumed 
to be responsible for the biochemistry- 
related content of the titles associated 
with the organization. Elsevier's role 
in the creation of the material is not 
lost because it can be assigned to those 
resources as a provider and licensor. 

In another common scenario, 
groups of small societies publish a title 
jointly, and the publisher statement 
contains the names of two or three
discrete entities with each given equal 
weight. E-Matrix's current functional- 
ity allows for only one authoritative 
organization heading to be assigned to 
each work. In these instances, every 
effort is made to determine which 
organization is chiefly responsible for 
the work in question and to assign 
authority accordingly. For example, 
considering the publisher statement 
"American Society for Environmental 
History and Forest History Society," 
the librarian determining the authori- 
tative name must conduct a thorough 
investigation that includes identify- 
ing and viewing publications linked 
to this statement, then reading front 
matter and publisher information to 
determine which publishing group is 
the primary contributor to the content 
of the title. 

Because NCSU Libraries holds a
fair number of foreign language titles,
the formatting of the authoritative
names for these titles is also important
to the library's collection managers.
They requested that all foreign
language authorities using Roman
script remain in the original language.
Authority headings for titles using
non-Roman scripts have been translated
into English rather than transliterated.
E-Matrix does not allow for the
inclusion of non-Roman characters,
and translations were determined to
be clearer and easier to assign than
transliterations.

Beyond these few specific 
requests, NCSU Libraries' collection 
managers and the E-Matrix develop- 
ment team felt comfortable relying 
on traditional library resources for the 
determination of names. The Library
of Congress Name Authority File
(LCNAF) (http://authorities.loc.gov)
has been used whenever possible as
a source for preferred names, as long
as they do not conflict with local customizations.
If no heading is available
from the LCNAF, trustworthy serials
databases such as Ulrich's Periodicals
Directory (http://ulrichsweb.com) and
the ISSN Portal (http://portal.issn.org)
have been used as secondary sources.
NCSU Libraries' sources correspond
almost exactly to those chosen by
Connaway and Dickey for OCLC's
monograph-focused publisher name
server.18 That project uses the LCNAF
as the chief source of authorities, followed
by Books in Print (http://booksinprint.com)
and the International
ISBN Agency (http://isbn.org), monographic
counterparts of Ulrich's and
the ISSN Portal, respectively.

Unlike OCLC, which has formalized
its choice of sources for name
selection, NCSU Libraries has the
flexibility to put its local needs first.
Rather than make a hard and fast rule
about the order of sources consulted,
the librarians at work on the name
authority project have the freedom to
make decisions on the basis of what
will best fit the library's needs. The
decision to disregard name changes,
mergers, and other business-related
changes until richer syndetic functions
can be incorporated into E-Matrix
illustrates a fundamental application
of the local needs principle. Another
example is the decision to apply
Anglo-American Cataloguing Rules,
2nd ed., rules for formatting of complex
names to all headings, regardless
of their source.19 This practice ensures
consistency and, just as importantly,
enhances the browsability of organization
data. When all departments,
research centers, publishing arms, and
other offshoots of a major institution
are grouped together using consistent
formatting, finding and grouping
resources emanating from a particular
group becomes easy, while still retaining
more detailed information about
the subgroups involved in its production.
E-Matrix's browse features, as
well as the sorting capabilities of the
reports module, are greatly enhanced
by this practice.

Librarians assigning authoritative
names are also encouraged to use their
knowledge of local practices to make
case-by-case exceptions when necessary.
Often, these types of decisions
are used to resolve small quirks that
might never be addressed by a stricter
set of rules. For example, the authoritative
name chosen for the Institute of
Electrical and Electronics Engineers
is IEEE, even though the LCNAF
has authorized the full version of the
name. While working through the list
of organizations, the continuing and
electronic resources librarian recognized
that the acronym
IEEE was simply more recognizable
to library staff than the organization's
rarely used full name. The full version
of the name will not be lost from the
name authority data, however, because
it is stored as a searchable cross reference.
On both the conceptual and
practical levels, the flexibility to tailor
the name authority specifically to
NCSU Libraries' interests has been
essential to its success.

As important as the authoritative
sources are other sources that
provide insight into unique or problematic
names not addressed in the traditional
library databases. Many of the
publisher names came in the form of
obscure initialisms, and, in these cases,
the website Acronym Finder
(http://acronymfinder.com) suggested leads
that eluded typical research sources.
In general, the Web search engines 
proved vital to researching the kind of 
obscure names that are routinely not 
found in the LCNAF. Often, use of 
E-Matrix itself was necessary to find 
the name of one or more journal titles 
associated with a specific publisher, 
and then a Web search engine was 
used to trace those titles back to the 
primary source. Viewing the publisher 
name in a table of contents or on an 
authoritative website provided an extra 
level of confidence in the decision- 
making process for publishers not 
found in an established source. 

The combination of an E-Matrix 
interface designed with user input and 
a set of fluid guidelines for name selec- 
tion has made the day-to-day work of 
assigning name authorities a smooth 
and intuitive process. While unexpect- 
ed and unusual names can slow what 
has become a speedy process, these 
are usually resolved through discus- 
sion and attention to the established 
guidelines and the library's needs. 
Again, the flexibility of the assigning 
process and the value placed on librar- 
ians' judgment have been essential to 
the implementation of this original 
and complex project. 

Organization Name Authority 
at Work 

The tools and procedures needed to
create a functional organization name
authority within E-Matrix were put
into place by summer 2007. As of this
writing, practical implementation of
the name authority project has been
underway for nearly a year. During
that period, the NCSU Libraries
has focused on the establishment of
authoritative names for organizations
stored in E-Matrix and on the use of
those names to enhance data integrity,
licensing procedures, and reporting
capabilities.

As of March 2008, the name 
authority team had evaluated 1,319 
organization names, not including 
those names originally evaluated in the 
pilot project. After authority control 
was applied, this group of names was 
reduced to 532 authorized organiza- 
tion names, a 59 percent reduction. 
These results were significantly more 
dramatic than the 8 percent drop seen 
in the pilot project, but that discrep- 
ancy can be explained by the post- 
pilot focus on normalizing the library's 
big deal packages. For instance, 58 
Elsevier variants were reduced to only 
one authoritative name, a 98 percent 
reduction in the number of variants 
and errors, contributing to a more 
dramatic decrease overall. Similarly, 
the 532 assigned authoritative names 
relate to 21,672 titles, a substantial 
percentage of the more than 35,000 
unique manifestations of resources 
currently managed through E-Matrix, 
further confirming the widespread 
effects of controlling the names of 
major organizations. 
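
The arithmetic behind these figures can be checked directly. The short sketch below recomputes them from the counts reported above; the function name is illustrative, and 35,000 is used as the base for the title share even though the text says "more than 35,000," so that share should be read as an upper bound.

```python
def percent_reduction(before, after):
    """Percentage by which a count shrank, relative to the starting count."""
    return 100 * (before - after) / before

overall = percent_reduction(1319, 532)   # all names evaluated after the pilot
elsevier = percent_reduction(58, 1)      # Elsevier variants collapsed to one heading
coverage = 100 * 21672 / 35000           # share of titles tied to an authorized name

print(f"{overall:.1f} {elsevier:.1f} {coverage:.1f}")
```

The computed values are 59.7, 98.3, and 61.9: the overall reduction reported above as 59 percent, the Elsevier reduction reported as 98 percent, and a title share of at most about 62 percent.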

On the broadest level, the intro- 
duction of authority control of names 
into E-Matrix has demonstrated pro- 
gression toward cleaner, more usable 
implementation of data on the basis 
of the concept of roles as defined in 
the DLF ERMI. By eliminating dupli- 
cation, variation, and errors in the 
organization data, E-Matrix enables 
use of role relationships. Titles that 
share a common publisher, vendor, 
or provider can now be identified 
using the authoritative name that links 
them to a single organization. Without 
authority control, an individual would 
have to manually account for each 
organization variant every time he or 
she worked with ERM data. 

The E-Matrix licensing module
also benefits from the use of clean
data. As the library begins its license
mapping process, human data entry 
will be used to map the details of a 
license into structured data elements 
within E-Matrix. Each organization 
name entered into the license form 
must correspond to the correct serial 
resources. To ensure that a license 
is correctly applied to all resources 
whose terms it dictates, the licensing 
module will allow only authoritative 
names to be entered into the licensor 
field. Limiting the available licensors 
within E-Matrix preserves the appro- 
priate use of roles and relationships 
throughout the data. 
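
The restriction on the licensor field amounts to a membership check at data entry. The sketch below is a hypothetical illustration: the `AUTHORITATIVE_NAMES` set, the `set_licensor` function, and the dictionary-based license record are assumptions made for the example, not part of the E-Matrix licensing module.

```python
class LicenseFormError(ValueError):
    """Raised when a licensor is not an authoritative organization name."""

# Hypothetical set of authoritative headings that have already been assigned.
AUTHORITATIVE_NAMES = {"Elsevier", "IEEE", "International Union of Biochemistry"}

def set_licensor(license_record, licensor):
    """Return a copy of the record with the licensor set, rejecting non-authoritative names."""
    if licensor not in AUTHORITATIVE_NAMES:
        raise LicenseFormError(f"{licensor!r} is not an authoritative name")
    return {**license_record, "licensor": licensor}
```

Rejecting a variant such as "Elsevier Science B.V." at entry time, rather than cleaning it up afterward, keeps every license attached to the same heading that the resources themselves carry.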

The benefits of clean role relation- 
ships can be seen even more substan- 
tially in E-Matrix's reporting module. 
The name authority team, in collabo- 
ration with collection managers and 
programmers, has begun to test the 
capabilities of the module to incorpo- 
rate authoritative names in ways that 
enhance the comprehensiveness and 
flexibility of reports. An authoritative 
publisher report displays every serial 
title in E-Matrix associated with a 
publisher. This very basic report serves 
mainly as a test object to illustrate 
how authoritative names have been 
incorporated into the data. Within 
E-Matrix, users can link from this 
report to detailed displays of publisher 
and resource information, facilitating 
discovery of related organizations and 
titles. Using the report module's export 
tool to transfer the data to a spread- 
sheet or database, all resources pub- 
lished by the same entity can be easily 
identified, and groups of similar pub- 
lishers explored. A similar test report 
displays each authoritative organiza- 
tion along with its related titles. This 
report expands the authoritative pub- 
lisher report across all roles, allowing 
for a more complete picture of how 
organizations relate to works and high- 
lighting implicit business relationships 
by illustrating the multiple organiza- 
tions associated with a single resource 
through their respective roles. 

In addition to these preliminary
reports, the name authority team also
has conceived more advanced reporting
using authoritative names. For
example, by leveraging the subject 
categories that have been assigned to 
every resource in E-Matrix, reports 
can be generated listing the most com- 
mon authoritative publishers within 
any given subject area. That data could 
be combined with the money spent on 
each resource to provide a compre- 
hensive picture of the amount spent 
per publisher in a certain subject area. 
Without the authoritative name data, 
such advanced reporting would be 
much more difficult because the pro- 
liferation of variant organization names 
would make the process of identify- 
ing and collocating all instances of a 
publisher tedious and prone to error. 
The name authority team hopes to 
see staff from other departments take 
advantage of the clean relationships 
between organizations and resources 
to produce analogously sophisticated 
custom reports. 
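
A report of spending per publisher within a subject area could be sketched as a grouped sum over resource rows. The rows, costs, and the `spend_by_publisher` function below are invented for illustration; they assume each resource already carries a subject category and an authoritative publisher name.

```python
from collections import defaultdict

# Hypothetical resource rows: (title, subject, authoritative publisher, annual cost).
resources = [
    ("Title 1", "Chemistry", "Elsevier", 1200.0),
    ("Title 2", "Chemistry", "Elsevier", 800.0),
    ("Title 3", "Chemistry", "American Chemical Society", 1500.0),
    ("Title 4", "History", "Forest History Society", 90.0),
]

def spend_by_publisher(subject):
    """Title count and total cost per authoritative publisher within one subject area."""
    totals = defaultdict(lambda: {"titles": 0, "spend": 0.0})
    for _title, subj, publisher, cost in resources:
        if subj == subject:
            totals[publisher]["titles"] += 1
            totals[publisher]["spend"] += cost
    # Publishers with the highest spend come first.
    return sorted(totals.items(), key=lambda item: item[1]["spend"], reverse=True)
```

The grouping key is the authoritative name, so the totals are only meaningful once variants have been collapsed; with uncontrolled names the same publisher would appear as several small rows.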

To ensure continuing data integ- 
rity across all modules of E-Matrix, the 
organization name authority project 
will continue to be maintained and 
enhanced once the primary authority 
project has been completed. Monthly 
maintenance reports will list any new 
organization names that have come 
into E-Matrix since the authority was 
last updated. Organization names with 
existing authorities will automatically 
be subsumed under the proper head- 
ing. In this way, the library will retain 
control over organization data within 
E-Matrix. 
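
The monthly maintenance report can be thought of as a filter on load date and authority status. The following sketch assumes, hypothetically, that each organization name is stored with the date it entered the system and that the authority is a mapping from names to headings; neither detail is confirmed by the project description.

```python
from datetime import date

def maintenance_report(names, authority, last_updated):
    """List names loaded after `last_updated` that have no authorized heading yet.

    `names` is an iterable of (name, date_loaded) pairs; `authority` maps
    known names to their headings.
    """
    return sorted(
        name for name, loaded in names
        if loaded > last_updated and name not in authority
    )

# Illustrative data only.
authority = {"Elsevier Science": "Elsevier"}
loaded = [
    ("Elsevier Science", date(2008, 4, 2)),    # already controlled, so omitted
    ("Small Press Ltd.", date(2008, 4, 10)),   # new and uncontrolled, so reported
    ("Old Society", date(2008, 1, 5)),         # loaded before the cutoff, so omitted
]
```

With a cutoff of `date(2008, 4, 1)`, only "Small Press Ltd." would appear on the report, leaving the team a short list of genuinely new names to evaluate each month.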

Conclusion: The Future of 
ERM Systems and Name 
Authority 

The name authority project marked 
the start of the NCSU Libraries' appli- 
cation of authority control on data 
within its ERM system. The demonstrated
need for authority control at
NCSU Libraries and indications of
similar thinking at other institutions
make clear that authority control is 
poised to emerge as an important issue 
in the serials and electronic resources 
field. As this issue evolves, NCSU 
Libraries aims to improve local prac- 
tices and participate in initiatives that 
span the library community. 

The name authority team plans 
to pursue the enhancement of the 
existing tool and its functions within 
E-Matrix. An ongoing development 
priority is the expansion of the tool's 
structure to more closely align it with 
the vision established at the outset of 
the project, namely the creation of a 
full record structure that would allow 
for the storage of contact informa- 
tion for technical and sales repre- 
sentatives, details of product trials, 
and other internal notes as needed. 
Such detailed records will support 
the library's goal of creating an ERM 
system that facilitates storage of the 
myriad details of transactions related 
to serials and electronic resources. 

Equally important will be the 
creation of a hierarchical structure 
to identify and describe business 
relationships between organizations. 
These connections may be represent- 
ed through simple linking relation- 
ships similar to the role relationships 
that link organizations to resources. 
The relationships suggested
for this structure include business-centered
relationships such as "purchased
by," "merged with," or "split
from." Alternatively, a complex, record-based
structure could provide a more
sophisticated representation of the 
series of acquisitions and mergers that 
characterize the serials industry. The 
ability to link organizations in a hierar- 
chy would result in a dynamic, family 
tree-like structure more illustrative 
than the flat linking structure cur- 
rently used in the assignment of roles. 
In either case, the capture of business 
data about serial publishers remains a 
top priority. 
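
The simple linking option can be sketched as a list of typed links that are followed upward to find an organization's current owner. This is an assumption-laden illustration: the `links` structure and both functions are hypothetical, and the "Example Imprint" and "Example Society" entries are invented to show a two-step chain.

```python
# Hypothetical typed links between organizations: (subject, relationship, object).
links = [
    ("Pergamon Press", "purchased by", "Elsevier"),
    ("Example Imprint", "purchased by", "Pergamon Press"),  # invented for illustration
    ("Example Imprint", "split from", "Example Society"),   # invented for illustration
]

def current_owner(org):
    """Follow 'purchased by' links upward until an organization has no purchaser."""
    purchased_by = {s: o for s, rel, o in links if rel == "purchased by"}
    seen = {org}
    while org in purchased_by:
        org = purchased_by[org]
        if org in seen:  # guard against circular data
            raise ValueError("circular ownership chain")
        seen.add(org)
    return org

def holdings(owner):
    """All organizations that resolve to `owner` through any chain of purchases."""
    purchased = {s for s, rel, _ in links if rel == "purchased by"}
    return sorted(name for name in purchased if current_owner(name) == owner)
```

Even this flat list of links yields the family tree-like view described above, since `holdings("Elsevier")` gathers both the directly acquired publisher and the imprint beneath it; a record-based structure would add dates and notes to each link.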

In addition to enhancing the structure
of the E-Matrix authority tool, the
name authority team also intends to
evaluate and streamline the process of 
investigating and assigning authorita- 
tive names. With the project under 
way and procedures in place for the 
determination of names, several strat- 
egies may help the name authority 
team in its task of evaluating thou- 
sands of organizations. In addition 
to adding staff to the project, algo- 
rithmic text analysis of organization 
name data offers several potential new 
courses for the name authority project. 
While the name authority team ini- 
tially dismissed the prospect of using 
an automated process to parse orga- 
nization names and choose the most 
appropriate heading, immersion in the 
process has shown that the vast major- 
ity of organization names are small 
publishing companies, self-publishers, 
and associations — many of whom are 
responsible for the publication of only 
one resource in the library's collec- 
tion. Evaluating each of these types of 
names one by one has been extremely 
time consuming and does not make 
the best use of library staff resources. 

One option is to use textual analy- 
sis to identify similar names, choose a 
likely authoritative name, and assign 
that name as a heading. Another would 
be to algorithmically group similar 
names, but then manually choose the 
authority. In both cases, the machine- 
based solution would result in rough 
authority control over a large set of 
rarely used organizations. In either situ- 
ation, the E-Matrix name authority tool 
still would enable any authority to be 
manually evaluated and changed upon 
request. Any organization names not 
included in the textual analysis would 
be assigned a priority and worked on 
as time allowed. No decisions have yet 
been made on the role of textual analy- 
sis in the creation of authority head- 
ings, and additional consideration will 
be necessary before the name author- 
ity team changes its policy of manual 
evaluation for all organizations. 
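
The second option, algorithmic grouping followed by manual choice of the heading, might look like the sketch below. It is one possible approach under stated assumptions, not a method the team has adopted: the token-overlap (Jaccard) measure, the stopword list, the 0.75 threshold, and the greedy grouping are all illustrative choices.

```python
import re

STOPWORDS = {"of", "the", "and", "for"}  # assumed list; would need tuning on real data

def tokens(name):
    """Lowercased word set for a name, ignoring punctuation and stopwords."""
    return {w for w in re.findall(r"[a-z0-9]+", name.lower()) if w not in STOPWORDS}

def jaccard(a, b):
    """Token overlap between two names: size of intersection over size of union."""
    ta, tb = tokens(a), tokens(b)
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0

def group_similar(names, threshold=0.75):
    """Greedy grouping: each name joins the first group whose first member is similar enough."""
    groups = []
    for name in names:
        for group in groups:
            if jaccard(name, group[0]) >= threshold:
                group.append(name)
                break
        else:
            groups.append([name])
    return groups
```

Applied to the Duke University School of Law variants shown in figure 4, this groups the three School of Law forms together and leaves a name such as "Duke University Press" in its own group. A librarian would still choose the heading for each group, and any grouping could be corrected through the existing reassignment feature.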

Finally, the name authority team
plans to turn its efforts to the use
of authoritative headings in E-Matrix
end-user displays. E-Matrix has not
yet taken advantage of the potential 
of authoritative headings to improve 
the interfaces used by staff across 
the library. In many display functions, 
E-Matrix continues to use nonau- 
thoritative organization names where 
authoritative names would produce 
a clearer, more coherent view of the 
library's serial holdings. To make 
the best use of the data being cre- 
ated, the name authority team plans to 
take a comprehensive look at each of 
E-Matrix's modules, determine where 
and how organization names are used, 
and outline the most effective type of 
display name to use in each circum- 
stance. 

In addition to enhancing the local 
uses of its name authority work, staff at 
the NCSU Libraries also aim to explore 
growing awareness in the library field 
of the need for authority control in the 
context of ERM. Involvement beyond 
NCSU Libraries so far is preliminary 
but has the potential to take many 
forms, including data and technology 
sharing with other libraries, collabora- 
tion with vendors in related efforts, 
and monitoring of and participation 
in standards groups that examine the 
issues of organizational identities. 

The responses to the survey conducted
for this paper, as well as
feedback gathered from a presentation
about the E-Matrix name authority 
tool at the 2008 Electronic Resources 
and Libraries conference in Atlanta, 
have demonstrated broad recognition 
of the need for organization author- 
ity control. Feedback indicated that 
many electronic resources librarians 
have begun to view development of 
reporting aspects of ERM tools as an 
important consideration for the future. 
These librarians have acknowledged 
that an organization name authority 
would be very useful for reporting 
functions. They have also made known 
that the data contained in such a 
name authority would have value outside
of ERM systems as a reference
source for all librarians who work with
vendors, publishers, and providers. 
Because many libraries do not yet have 
the experience or resources neces- 
sary to implement organization name 
authority, the librarians who contrib- 
uted their opinions to this project 
expressed a strong interest in sharing 
organization name authority data and 
the tools used to manage and create it. 
As development of the name author- 
ity tool at NCSU Libraries continues, 
the team will work with the E-Matrix 
administrative group to explore pos- 
sible methods for sharing the data, the 
tool, or both. 

In addition to librarians, vendors 
have also emerged as supporters of 
this enterprise. The publisher name 
authority and WorldCat Registry prod- 
ucts under development at OCLC 
represent the efforts of a major library 
service provider to establish authorita- 
tive identities for monographic pub- 
lishers as well as for libraries and 
consortia. The OpenIdentify Look-Up
Service (http://openidentify.com),
a product recently developed by the
journal supply chain support provider
Ringgold, marks another vendor-driven
effort to establish widespread
organization name authority control.
OpenIdentify has assigned unique
identifiers to more than one hundred
thousand subscribers to academic
journals and organized them hierarchically.
Like the WorldCat Registry,
OpenIdentify approaches the problem
of organization name authority from 
the perspective of a service provider 
with a need to control the identities of 
its clients: universities, libraries, cor- 
porations, and other information insti- 
tutions at the purchasing end of the 
serials and electronic resources trans- 
action. The dataset of authoritative 
publisher, vendor, and provider names 
would be an ideal complement to 
vendor-created name authorities con- 
trolling serials purchasers. Together, 
these two authorities would cover both 
ends of the serials transaction. 

Standards bodies will also be a
presence in any organization name
authority efforts that span the library 
and information field. The National 
Information Standards Organization 
(NISO) has approved the formation 
of a working group dedicated to the 
creation of a standard for institutional 
identifiers for libraries and publishers. 
The group will establish the metadata 
elements required for use of institu- 
tional identifiers as well as develop use 
cases for the standard. 20 The NISO 
work will continue efforts begun by
the Journal Supply Chain Efficiency
Improvement Pilot (http://journalsupplychain.com),
a collaborative project
dedicated to exploring the benefits
of institutional identifiers and pre- 
liminary implementation strategies. 
By involving a standards group like 
NISO in the process of creating and 
disseminating authoritative names, 
the project will gain legitimacy and 
avoid conflicts of interest that could 
arise from close association with a 
single vendor or institution. While this 
level of development for organization 
name authority initiatives is still in its 
infancy, the inclusion of such projects 
in the agenda of standards groups 
recognizes the need for and is a step 
toward a solution that could integrate 
the work already begun by a variety 
of groups. 

The implementation of an organi- 
zation name authority to enhance elec- 
tronic resources collection intelligence 
has been an innovative and successful 
venture at NCSU Libraries. Imposing 
authority control on the organization 
name data within the ERM system 
has laid the groundwork for greater 
precision and comprehensiveness in 
NCSU Libraries' reporting and collec- 
tion analysis efforts as well as contrib- 
uted to the creation of cleaner, more 
accurate data. Exploration of name 
authority practices throughout the 
library and vendor communities has
confirmed that a need exists for the 
control of organization data and dem- 
onstrated that opportunities abound 
for collaboration and enhancement 
of the original project. Management
of serials and electronic resources is 
often a complex and difficult endeavor. 
NCSU Libraries' organization name 
authority illustrates the power of cre- 
ating an ERM tool to meet specific 
local requirements and the potential 
benefits of expanding it to address the 
broader needs of the library and infor- 
mation community. 
Descriptive Metadata for
Digitization of Maps in Books: 
A British Library Project 
Hidden special collections are increasingly being made visible and accessible
by small digitization projects. In the project described in this paper, the British 
Library employed existing library standards and systems to accomplish key func- 
tions of a project to digitize a selection of maps contained within rare books. The 
integrated library system, using the Anglo-American Cataloguing Rules (AACR) 
and Machine-Readable Cataloging (MARC) format, acted as a lynchpin, linking 
directly bibliographic descriptions of both the original and the digital copies of the 
map, the book containing the map, the digital image, and preservation data and 
strategy, making the items widely searchable and visible while uniting them with 
the broader collections. 
In tandem with the surge of mass book digitization projects has been a move-
ment to highlight small special collections with digitization and cataloging. The 
Library of Congress (LC) Working Group on the Future of Bibliographic Control 
recommended as a priority enhancing access to rare, unique, and special hidden 
materials, encouraging digitization and creation of detailed descriptions, as well as 
integrating access to these materials with wider institutional holdings. 1 With the 
capabilities of today's library systems, a surprisingly large number of these tasks 
are possible in many libraries using existing library skills and resources. 

In the project described in this paper, the British Library (BL) employed 
existing and emerging library standards and systems to accomplish key functions 
in a project to digitize a selection of maps and views contained within rare books. 
While the project involved a number of stages and areas of expertise, this paper 
will explicate the manner in which the authors handled the need for descriptive 
metadata identifying the item and its source, documenting copy-specific attri- 
butes, and making the record and its digital surrogate accessible. The main library 
system in the BL, the Aleph 500 integrated library system (ILS) produced by Ex 
Libris, acted as a lynchpin, linking directly bibliographic descriptions of both the 
original and the digital copies of the map and the book in which the map appears, 
the digital image files captured, and the preservation strategy, making them wide- 
ly searchable and visible while uniting them with the broader collections. This 
project represents the first use in the BL of the digital asset module in Aleph. 



Description of the BL Project 

The Vulnerable Collection Items Project was undertaken at the BL to select, digi- 
tize, and collect metadata for maps held within the rare printed books collection. 
Following thefts of valuable maps contained within books from multiple institu- 
tions that included the BL, it was thought that a method should be developed 
to firmly identify the unique copies
of rare and important BL holdings 
to better protect valuable collection 
materials considered vulnerable. The 
resulting process combined sets of 
high-resolution security photographs, 
bibliographic metadata to describe the 
physical object (which includes copy- 
specific descriptive metadata such as 
condition descriptions), metadata for 
the digitized image, and linking this 
and the image to the bibliographic 
metadata. This enabled the highest 
possible level of identification of dis- 
tinguishing features that existing BL 
systems can accommodate, improving 
the security of the selected maps. 2 

The original, security-oriented 
project aims eventually blossomed into 
something of more universal use and 
wider research value. Once digital
photographs of the collection items
and associated metadata had been
acquired, it became clear that sharing
the information would contribute to
accomplishing other BL strategic priorities.
The project could serve to answer 
the library's security concerns while 
enhancing user access to the collec- 
tions by providing publicly accessible 
metadata for, and images of, the maps 
under consideration. The advantages 
of revealing these hidden collections 
were deemed to far exceed the poten- 
tial pitfalls inherent in extending the 
project's aims. 



Literature Review 

In the plethora of funded digitiza- 
tion projects throughout educational 
and cultural sectors, visual collections 
identified as "otherwise hidden" have 
been well represented in recent years. 
Methods for capturing metadata dur- 
ing digitization projects for such spe- 
cial collections have been plentiful 
in the current literature, represent- 
ing manuscripts, ephemera, fanzines, 
remotely sensed imagery, original art, 
architectural images, posters, and 
postcards. 3 



Maps are no exception to this 
attention, with numerous scanning 
products using a variety of stan- 
dards and methods for metadata 
capture evident on the Web. The 
American Library Association Map 
and Geography Round Table Map
Scanning Registry, ongoing since
2006, is the primary online listing of 
map scanning projects. 4 This constant- 
ly updated source provides outline 
information (prepared by the project 
owners) about the projects, describing 
the content, technical standards used, 
and metadata captured. Though most 
of the projects represented are either 
not collecting metadata or have not 
provided this information in the reg- 
istry, those that have done so list the 
Federal Geographic Data Committee 
Metadata Standard (FGDC), Machine-Readable
Cataloging (MARC) 21, or
Dublin Core (DC) standards, which 
are widely adopted by metadata librar- 
ians for digitizing projects and born- 
digital data collections. 

MARC as a Metadata Solution 

The use of MARC and the
Anglo-American Cataloguing Rules (AACR)
has been a less popular approach
for capturing descriptive metadata for
special format digitization projects.
MARC as a tool in digitization projects
has been criticized in the past for
being "too complex, requiring highly
trained staff and specialized input
systems," and for being too focused
on print material and not extendable
for digital collections. 5 More recent
reviews and comparisons have looked
upon MARC more favorably. Beall
outlines twelve criteria for comparing
metadata schema, suggesting numerous
advantages for the use of MARC
in library projects. 6 Significant among
Beall's criteria is the availability of
systems and software to support any
given metadata scheme, meaning that
MARC metadata can be created and
searched in library ILSs, a desirable
feature often taken for granted. Layne
reported as early as 1991 the use of
the MARC format for a digitization
project of images in medieval manuscripts. 7
Recognizing that MARC is
very often not the method of choice
for manuscript materials, her primary
question related to the usefulness of
MARC for this purpose, which was to
provide description and access, while 
applying widely known and accepted 
standards. She concluded that the 
flexibility of the format was effective, 
and it continues to be used. Other 
small, special format digitization proj- 
ects described in the literature using 
MARC include the joint University
of Pennsylvania Library-Cambridge 
University Library project in 2006 
to create online catalog records and 
an image database on the Web of 
dispersed manuscript fragments and 
the mixed collections of ephemera 
(Pennsylvania German broadsides and 
Fraktur) taking place at Pennsylvania 
State University Libraries. 8 The former
selected MARC principally to
integrate the descriptive metadata 
into the existing library system while 
the website uses a crosswalk to con- 
vert the records to DC. The latter 
focussed on the challenges of using 
the multiple sets of cataloging rules 
accompanying formats of monograph- 
ic broadsides, graphic materials, and 
manuscripts. 

These cases represent compara- 
tively small projects limited to finite 
collections. MARC is also applied for
ongoing, nonproject-based digitization.
Two major institutions using MARC
for ongoing maps digitization are the
LC Geography and Maps Division, as
part of the American Memory Project
(http://memory.loc.gov/ammem), and
the Harvard Map Collection
(http://hcl.harvard.edu/libraries/maps).
Both of these use MARC records
for descriptive metadata and to pro- 
vide access to scanned map images 
from the online public access catalog 
(OPAC) through hyperlinking. When 
the link in the OPAC is selected, an 
external viewer is launched that allows 
interactive features with the image,
such as panning and zooming. 

Previously published research 
relating experience with the use of 
MARC for special format digitiza- 
tion projects enriches understanding 
of the benefits of using MARC as a 
format and the challenges and meth- 
ods of interpreting Anglo-American 
Cataloguing Rules, 2nd ed. (AACR2) 
cataloging standards for such collec- 
tions. 9 Thus far, however, no detailed 
practical reports on the way in which 
MARC analytics can be applied to 
reveal visual materials hidden with- 
in another bibliographical unit (e.g., 
maps in books) or how bibliographical 
data would be structured to accommo- 
date this have been published. 

Other Metadata Solutions for 
Digitized Maps 

Many digitization projects are employ- 
ing metadata standards specifically 
designed to capture information about 
digital image data, and these schemas 
reflect the flexibility possible in the new 
and continually emerging systems used 
to manage them. Projects handling car- 
tographic materials, in common with 
wider practice, use any number and 
combination of standards and methods 
of capturing metadata, including the 
well-established DC, MARC, Federal 
Geographic Data Committee (FGDC), 
Encoded Archival Description (EAD), 
Metadata Encoding and Transmission
Standard (METS), and Metadata 
Object Description Schema (MODS). 
The projects including maps are too 
numerous to detail comprehensively, 
so a select few are highlighted to 
illustrate the diversity and flexibility of 
solutions being developed. 

In many cases, a defined scheme 
is adapted to the project. In the case 
of the Collaborative Digitization 
Program (originally the Colorado 
Digital Program), no one standard 
was deemed appropriate; the participating
institutions' existent metadata
schema were all mapped to a
minimal DC element set in order
to facilitate efficient crosswalking of 
metadata from the archival, museum, 
and library collaborators. 10 The same 
type of amalgamation was applied by 
the project librarian Nicholas Graham 
for North Carolina Maps
(www.lib.unc.edu/dc/ncmaps), a collaborative
project merging images and records 
from library and archives catalogs in 
both MARC and EAD. 11 The data 
from existing catalog records were 
downloaded to a spreadsheet, addi- 
tional fields were added, and the plan 
is to eventually export these data to 
MODS. Such a method requires con- 
sistent mapping between the various 
metadata fields, and Graham's work 
crosswalking between four standards 
is invaluable. 
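
The kind of field-to-field mapping that such crosswalking depends on can be sketched in a few lines. The following is only an illustration, not the mapping used by any of the projects described here: the tag and subfield pairs are assumptions loosely based on commonly published MARC-to-DC mappings, and the sample record values are invented.

```python
# Illustrative MARC 21 -> simple Dublin Core crosswalk.
# ASSUMPTION: this mapping table is hypothetical and deliberately minimal;
# it is not the crosswalk used by any project named in the text.
CROSSWALK = {
    ("100", "a"): "creator",
    ("245", "a"): "title",
    ("260", "b"): "publisher",
    ("260", "c"): "date",
    ("520", "a"): "description",
    ("650", "a"): "subject",
    ("651", "a"): "coverage",
}


def marc_to_dc(record):
    """Convert a record given as [(tag, {subfield_code: value}), ...] to DC.

    Returns a dict of DC element name -> list of values. Fields with no
    mapping are ignored, which is where information is lost in a crosswalk.
    """
    dc = {}
    for tag, subfields in record:
        for code, value in subfields.items():
            element = CROSSWALK.get((tag, code))
            if element is not None:
                # Drop leading/trailing ISBD punctuation such as " /" or " :".
                dc.setdefault(element, []).append(value.strip(" /:;,"))
    return dc


# Invented sample data, for illustration only.
sample = [
    ("245", {"a": "Example map of a coastline /"}),
    ("260", {"a": "London :", "b": "Example Printer,", "c": "1651"}),
    ("651", {"a": "Example Region"}),
]
dc = marc_to_dc(sample)
```

Because 260 $a (place of publication) has no entry in the table, it is silently dropped, which shows why consistent and complete mapping between fields matters.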

In other cases, more than one 
set of metadata is captured, allowing 
different standards for different pur- 
poses. METS, an Extensible Markup 
Language (XML)-based schema 
for packaging related sets of digital 
objects, was used for digitised Sanborn 
maps at University of Colorado at 
Boulder Libraries, but only after 
MARC records were created for the 
digital and analog versions in the 
library catalogs, with the data then 
converted to XML. 12 In combination 
with locally developed tools, the proj- 
ect used MarcEdit, a freely available 
utility developed by Terry Reese at 
Oregon State University for batch edit- 
ing and converting MARC between 
formats. 13 The same tool was used by 
Brenner in her innovative project with 
the Oregon Sustainable Community 
Digital Library (http://oscdl.research.pdx.edu)
to merge metadata from disparate
contributors, display scanned
materials through Google Earth, and 
provide MARC metadata, all directly 
from the library's OPAC. 14 

Scanned maps that are converted 
to geospatial data, as in McGlamery's 
monumental distributed project of 
scanned and geo-referenced topo- 
graphic maps of the Austro-Hungarian 
Empire, require more specialised
content standards. 15 Although the
International Organization for 
Standardization standard for geograph- 
ic information (ISO 19115) was con- 
sidered, FGDC's Content Standard for 
Digital Geospatial Metadata, with its 
antecedents in MARC, was ultimately 
selected, and a customised application 
was developed for metadata input. 

These standards are used effec- 
tively with a host of new commercial 
software products devised to manage 
metadata collection, discovery, distri- 
bution, and display of digitized imag- 
es. Referred to collectively as Digital 
Visual Information Management, 
these systems include library OPACs, 
content management systems, digital 
asset management systems, and digital 
repositories. 16 

Managing Digitized Content 
in the BL 

A number of approaches are currently 
being taken to manage digitized con- 
tent in the BL. Although MARC and 
AACR2 are used for the majority of 
BL cataloging, the BL has used dif- 
ferent standards for specific circum- 
stances, and a combination of in-house 
BL work and components provided by 
third parties for metadata and systems 
is usually used. 

The BL Application Profile 
(BLAP), an extended DC-based 
declaration of descriptive metadata 
terms encoded as XML, was devel- 
oped to support a high-level cross- 
searching facility among BL resources 
of different types and with different 
metadata formats, and has been used 
in several large BL digitization proj- 
ects. An early implementation was 
Collect Britain (www.collectbritain.co.uk),
one of the largest digitization
projects thus far carried out by the 
BL, the aim of which was to digitize 
a selection of historic content from 
several BL collection areas. BLAP 
metadata was stored in a Structured
Query Language (SQL) Server 2000
database and digital objects in a file
store, using content management systems
to upload and deliver Web content.
Another large project, British
Newspapers 1800-1900
(www.bl.uk/reshelp/findhelprestype/news/newspdigproj/ndproject),
digitizing up to
two million pages of British national,
regional, and local newspapers,
used a customization of BLAP and an
Open Archives Initiative Protocol for
Metadata Harvesting (OAI-PMH) data provider
service for interoperability requirements
within third-party content management
systems. BLAP also has been
used to describe sound recordings in
the Archival Sound Recordings project
(http://sounds.bl.uk) and to provide
a metadata standard for use with BL 
Web resources. 

Many small, discrete digitiza- 
tion projects fall into the Themed 
Collections Programme, defined as 
"systems developed using a standard 
software architecture designed to hold 
varied and complex data and allow it 
to be searched, edited, and presented 
in various 'themed' ways, usually on 
the internet." 17 These were devised
to provide cataloging and a search 
interface for collections considered 
incompatible with the ILS; biblio- 
graphic data is stored in XML in SQL 
Server 2005. The Themed Collections
system has been used successfully for several
BL projects that mix metadata,
text, and images. These include databases
of, for example, Renaissance
Festival Books
(www.bl.uk/treasures/festivalbooks/homepage.html),
Database of Italian Academies
(www.bl.uk/catalogues/ItalianAcademies),
and Historic Photographs
(www.bl.uk/onlinegallery/features/photographicproject/index.html).
In addition to resource discovery through
the BL website, the system facilitates 
data exchange with other organiza- 
tions, links to digital objects, and item 
requesting. METS has already been 
used in the BL as a "wrapper" for the 
Archival Sound Recordings project
(http://sounds.bl.uk) and is now also 
being used to package the various
types of metadata associated with
e-journals (e.g., MODS, PREservation
Metadata: Implementation Strategies
(PREMIS)). The use of METS will no 
doubt be extended to other content 
types. 

OAI-PMH is currently being 
investigated by the BL to harvest data 
from digital objects stored in the BL's
Digital Library System so that it can be 
used for a variety of resource discovery 
initiatives such as the European Union-funded
Europeana project (http://dev.europeana.eu).
OAI-PMH is still
being used in the BL only for specific
projects such as these, and although
it was not ready to be used for this project,
the BL plans to expand its usage into
more general areas.

Selection of a System 

The need to capture and organize 
descriptive metadata to accompany 
the digitized map images meant that, 
in order to be effective, the system 
needed to 

• ingest the description of the 
map, its bibliographic source, 
and the individual copy condi- 
tion; 

• accommodate electronic search- 
ing and access to the records 
and potentially images, ideally 
linking the two; and 

• ensure institutional long-term 
maintenance, preservation, and 
technical support. 

At first glance, the Themed 
Collections system seemed appropri- 
ate, since it had been used at the BL in 
the past to manage images and meta- 
data associated with special collections. 
With further examination, however, the 
ILS was chosen, for several reasons: 

• Avoiding the unnecessary creation
of new software or a
website-specific database for
this project was considered of
paramount importance. The
use of Themed Collections soft- 
ware and hardware would have 
necessitated building and popu- 
lating a new database, which 
would have been costly and 
time-consuming. 

• Bibliographic records for the 
books containing the images 
were already in the ILS, as 
were some records represent- 
ing individual maps contained 
within books or atlases. 

• The ILS supported the use of 
analytical bibliographic records 
to describe discrete elements 
contained within bibliographic 
units (e.g., a map within a book) 
by means of "child" analytical 
bibliographic records linked 
to the "parent" bibliographic 
record representing the work 
containing the map. 

• The ILS possessed functionality 
to link the digital images to the 
metadata, so it could in theory 
present a complete representa- 
tion of the image to the user. 

• Resource discovery by the public
was already possible through
the OPAC and allowed the flex- 
ibility to make the record public 
(or not). This was an important 
consideration as the project 
developed and the desired out- 
comes changed. 

• Because the ILS is the core cat- 
aloging and resource discovery 
system used by the BL, future 
support for records created for 
this project was guaranteed. 

• An infrastructure (the BL's
Digital Library Programme) 
was already in place to support 
ingesting and preserving digital 
objects. 

Selection of Standards 

Several of the various standards 
described above appeared to be potential
solutions, including MARC 21/AACR2,
BLAP, and MODS/METS. The
reasons why MARC 21/AACR2 was
chosen included the following:

• DC-derived standards such 
as BLAP likely would have to 
be customized for this project, 
which would be time consum- 
ing, whereas the most common- 
ly used standards in the BL, 
MARC and AACR2, could be 
employed without modification. 

• AACR2 is the international con- 
tent standard used by the BL to 
describe much of its collection, 
and it is the standard used in 
current cataloging. In addition 
to printed books, AACR2 fully 
supports cartographic materials. 

• The MARC 21 bibliographic, 
authority, and holdings formats 
(all used by the BL) provide a 
way to express catalog records 
created according to AACR2 
standards in MARC format, and 
provide some additional information
(e.g., coded data and
content designation) used by 
computers to enhance access. 

• MARC is a proven standard; 
its stability and its granularity 
for descriptive metadata recom- 
mended it. With these quali- 
ties it can both operate well in 
current systems and easily be 
migrated in the future. 

• MARC is continuously growing 
to accommodate new techno- 
logical advancements, and so is 
equipped to handle the neces- 
sary hybrid of print and digital 
information. 

• MARC contains data elements 
for recording preservation 
actions. 

• MARC supports the expression 
of relationships between related 
items. 

• MARC and AACR2 enable the 
recording of information spe- 
cific to particular copies of a 
work. This means that unique
characteristics of the map could
be recorded, providing obvious
benefits for collection security.

• MARC and AACR2 are at
the heart of mainstream BL 
cataloging. Thus records cre- 
ated or reused for this project 
using those standards would 
follow the same development 
path as most other BL cata- 
log records (e.g., forthcoming 
moves from AACR2 to its suc- 
cessor, Resource Description 
and Access (RDA), and from 
MARC 21 to XML-based 
MARC formats). 

In addition to AACR2, its collateral 
publication, Cartographic Materials: A 
Manual of Interpretation for AACR2, 
was used as the primary authority 
consulted for reference, along with 
"MARC 21 Format for Bibliographic 
Data." 18 Other standards used includ- 
ed Library of Congress Subject 
Headings and the NACO (Name 
Authorities Cooperative) Authority 
File. 19 Although AACR2 was used to 
construct the bibliographic records 
representing the images, the host book 
records would most often not be con- 
structed according to AACR2 because 
they were created long before AACR2 
was introduced. 

Because entire books were not 
scanned but only the maps, structural 
metadata was simple and could be 
noted in MARC. The level of techni- 
cal metadata, included in the Aleph 
Digital Asset Management (ADAM) 
metadata record representing the raw 
images, was deemed sufficient. 

Project Implementation 

Although the use of the ILS as well as 
MARC and AACR2 appeared to be 
the most suitable approach, several 
aspects were untested. For example, 
the BL had until then cataloged below 
the level of the item only in the case
of conference proceedings and did not
yet have a policy for recording copy- 
specific information. Additionally, 
the ADAM module, a priced add-on 
option to Aleph available beginning 
with version 16.03 that operates with- 
in the cataloging module, had been 
acquired by the BL but not exploited 
extensively; it allows small-scale (i.e., 
not a digital archive) management 
of digital objects within the Aleph 
environment. This project therefore 
presented a unique opportunity to 
test the feasibility of applying the 
BL's existing systems and standards to
manage these complex facets of the 
project as well as a challenge to adjust 
local policy, technology, and practice 
to accomplish these ends. 

The authors wished to use exist- 
ing resources in the library, in terms 
of established bibliographic standards 
and technology, to integrate materials
with the BL's larger holdings and
to increase the items' discoverability
to users, whether they are searching 
for a citation or the digital image 
itself. The metadata structure, for- 
mat, values, and content needed to 
fit into the established standards of 
the larger institution to ensure that it 
would be supported in future poten- 
tial changes, such as migrations in 
the library system, shifting library 
standards, Web access, and technol- 
ogy. Cataloging this unusual medium 
(early cartographic images contained 
within books and their digital manifes- 
tations) required expanding how the 
BL currently used the standards and 
system. For the metadata segment 
of the project alone, it was neces- 
sary to draw on support and advice 
from numerous units of the library, 
including British Collections and the 
Map Library, for staffing, curatorial 
insight, and project management and 
to ensure the most up-to-date practic- 
es were used for map cataloging and 
digitization; Systems Management for 
problem solving and technical support
to enable the ILS to suit the
project's needs; Bibliographic and
Metadata Standards and Data Quality
and Authority Control to review and
approve the template elements and
functionality; and Resource Discovery
and Applications Development for
creative development of the system
and policy decisions regarding access
to additional modules.

Figure 1. This English book, The Discovery of New Brittaine, contains a folded map within
titled "a mappe of Virgina discovered to ye hills."



Operational Phase One 

During the first phase of the project, 
constituting an operational period of 
approximately nine months, more than 
three thousand maps of the world 
and of the Americas, produced by 
Europeans between the late fifteenth 
century and 1700, were selected for 
inclusion in the project. Figure 1 
shows a typical map included as it sits 
within its containing volume. It por- 
trays the mid-Atlantic coast, a region 
of intense interest to Western Europe 
in the seventeenth century, and is con- 
tained within a 1651 English text, the 
Discovery of New Brittaine (London: 
I. Stephenson, 1651), describing the 
"discovery of New Brittaine." The 
book will be included in the catalogs 
of most libraries that own it; the map 
(illustrated in figure 1) will not. In 
some cases, the maps selected already 
had skeleton catalog records in the 
system, whereas in other cases there 
was no catalog representation because 
the BL only selectively catalogs spe- 
cial format material (e.g., illustrations 
and maps) contained within books. 
The books in which the maps were 
held were already represented in the 
BL catalog, the majority with mini- 
mal records retrospectively converted 
from the printed catalogs. 

Staff for the cataloging portion 
of the project initially consisted of 
three individuals. The map curator 
designed the templates for the records 
and coordinated with other relevant 
teams in the library for policy decisions,
advice, and to arrange required
functionality. Two full-time project
curators were employed to devote the 
majority of their time to cataloging. 
Between these two, exceptional exper- 
tise was brought on different fronts. 
One offered knowledge of the map 
literature and extensive experience 
with antiquarian maps, background 
appropriate to provide bibliographical 
reference citation notes, detailed con- 
dition descriptions and copy-specific 
information that would aid in identify- 
ing particular distinguishing features 
for each map. The other, a trained 
cataloger, brought adeptness with the 
library system, current standards, and 
the technology. Both had multilin- 
gual abilities, beneficial for handling 
the multitude of Western European 
languages. Above all, flexibility was 
an essential attribute for the project 
because the processes and technolo- 
gies were in many cases new to the BL 
and had to be developed and coordi- 
nated as needs arose. 

Descriptive Metadata: Map 

The data structure for a book in the 
ILS takes the form of a MARC 21 bibliographic
record for the book, a separate
MARC 21 holdings record linked
to the book record and containing data
about the book's location as well as its 
physical condition and other aspects 
specific to the individual copy, and an 
Aleph-specific item record represent- 
ing the physical copy. The item record 
is linked to the holdings record. The 
presence of an item record introduces 
some conceptual problems because 
the holdings record represents the 
individual copy in MARC; however, 
item records are essential because 
Aleph administrative functions are 
carried out against them. 

The analytic bibliographic record 
representing the map does not have its 
own holdings record because holdings 
policy in the BL dictates that hold- 
ings below the item level may not be 
expressed. Instead, it is linked to the 
bibliographic record for the host item. 
This link enables viewing the location 
of the host item and requesting the 
item through the map analytic record, 
even though the holdings record is not 
linked directly to it. Figure 2 presents 
a model of this data structure. 
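
The record structure described above can also be expressed as a small data model. This is a sketch for illustration only, not a representation of how Aleph stores records internally; the class names, attribute names, system numbers, and shelfmark are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ItemRecord:
    """System-specific item record representing the physical copy (hypothetical)."""
    barcode: str


@dataclass
class HoldingsRecord:
    """Holdings record: location plus copy-specific data such as condition."""
    shelfmark: str
    condition_note: str = ""
    items: List[ItemRecord] = field(default_factory=list)


@dataclass
class BibRecord:
    """Bibliographic record for either a host book or an analytic (map) record."""
    system_number: str
    title: str
    holdings: List[HoldingsRecord] = field(default_factory=list)
    host: Optional["BibRecord"] = None  # set only on analytic (child) records


def locations(record: BibRecord) -> List[str]:
    """Return shelfmarks for a record.

    An analytic record has no holdings of its own, so the lookup follows
    the link to the host record and reports the host's location instead.
    """
    if record.holdings:
        return [h.shelfmark for h in record.holdings]
    if record.host is not None:
        return locations(record.host)
    return []


# Invented example values.
book = BibRecord(
    system_number="000000001",
    title="Example host book",
    holdings=[HoldingsRecord("EXAMPLE.1.a", items=[ItemRecord("B0001")])],
)
map_record = BibRecord("000000002", "Example folded map", host=book)
map_locations = locations(map_record)
```

The design choice mirrored here is that location is stated once, on the host book's holdings record, and the analytic record reaches it through the link rather than duplicating it.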

The project team realized quickly 
that individually cataloging each map 
would be the most efficient means of 
identifying each and recording its loca- 
tion and context. Creating a new record 
for the map using the MARC format
and the AACR2 standards for carto- 
graphic materials would capture the 
bibliographical information by which 
items might be searched. Additionally 
the linking functionality offered by 
MARC and the ILS would properly 
express the relationship between the 
map and its host item. 

The project template consisted of 
a set of core data elements. Common 
to other online catalog records for car- 
tographic materials, they were already 
accommodated in the ILS and fol- 
low AACR2 rules. Other fields, such 
as notes, added entries, and addi- 
tional subject headings, were added 
when appropriate. The LC access- 
level record standard was not specifi- 
cally considered for this project, but 
many data elements it includes were 
replicated. 20 Some data elements it 
contains were inappropriate for this 
project, for example the MARC link- 
ing fields 580 and 780 for reasons 
given in the following section. 

The appendix presents a list of 
the fields in the template for the ana- 
lytic bibliographic record for the map, 
with standard options and anomalies or
features specific to the project noted.
Numerous other fields appear in the
records. The Aleph Linker (LKR) field
expresses the link between the analytic
and the host record in the ILS.
Among the copy-specific information
fields used at the BL, the 562 Copy
and Version Identification Note was
employed, the development of which
is described in the similarly titled section
below. Additional required ILS-specific
fields are also used. Figure 3 is
an example of a completed record and
describes the image shown in figure 1.

Figure 3. A Bibliographic Record for Map in Figure 1, As Seen within the Staff Interface
of the BL's ILS

Linking the Analytic Record to the 
Host Record 

Bibliographic records representing the 
pre-1700 books in which the maps 
were contained were already pres- 
ent in the ILS. This project did not 
require changing or upgrading these 
records in any way. 

The analytic bibliographic records 
representing the maps had to be cre- 
ated where they did not already exist 
according to the standard described 



above. Initially, the link between the 
child analytic map record and the par- 
ent book record was a tenuous one, 
built by manually entering the host 
and shelfmark as MARC fields 740 
(added entries for related or analytical 
titles), 773 (information concerning 
the host item for the constituent unit 
described), and 852 (location) in each 
analytic record. The linking functionality 
afforded by the MARC linking fields, 
although technically possible in Aleph, 
was not used in the BL implementa- 
tion; instead, the dedicated Aleph LKR 
field was used. The ILS can accom- 
modate several different types of links 
between records using the LKR field; 
for example, it is used to link the 
holdings record to the bibliographic 
record. The link that was used for this 
project was the "Up/down" analytic 
link between bibliographic records of 
different levels, in this case between 
the parent book record and the child 
analytic map record. 

Record system numbers, which 
are unique to each record in the ILS 
(and are used as the unique identifiers 
throughout the project), are fundamental 
to the linking process. Although 
the LKR field has the functionality 
to create the links, the system num- 
ber of the linked record in the LKR 
field identifies which record should 
be linked to which. The LKR field 
appears in the sample bibliographic 
record in figure 3. 

Systems Management assisted 
in the development of a macro for 
inserting and populating an LKR field 
into the child record to generate a 
hyperlink between the two automati- 
cally, and this was integrated into the 
cataloging workflow. By entering the 
LKR field in one record, the ILS 
functionality generates reciprocal links 
between the parent and child records. 
One effect of this is to expand the 
location information given in the hold- 
ings record for the book into the ana- 
lytic bibliographic record, facilitating 
requesting in the BL's onsite retrieval 
system. 
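
The reciprocal linking described above can be sketched in a few lines. This is a minimal illustration, not Aleph's implementation: the dictionary layout, the `lkr_up` key, the system numbers, and the shelfmark value are all hypothetical stand-ins for the LKR field and the holdings data.

```python
from collections import defaultdict


def build_links(records):
    """Derive reciprocal up/down links and inherited locations.

    `records` maps a system number to a dict with optional keys:
      "lkr_up"   - system number of the parent (host) record
      "location" - shelfmark from the record's own holdings, if any
    """
    children_of = defaultdict(list)
    for sysno, rec in records.items():
        parent = rec.get("lkr_up")
        if parent is None:
            continue
        if parent not in records:
            raise KeyError(f"record {sysno} links to unknown parent {parent}")
        children_of[parent].append(sysno)

    resolved = {}
    for sysno, rec in records.items():
        parent = rec.get("lkr_up")
        # An analytic without holdings of its own takes the host's location.
        inherited = records[parent].get("location") if parent else None
        resolved[sysno] = {
            "up": parent,
            "down": sorted(children_of.get(sysno, [])),
            "location": rec.get("location") or inherited,
        }
    return resolved


records = {
    "000000001": {"location": "SHELFMARK-A"},  # parent book (placeholder values)
    "000000002": {"lkr_up": "000000001"},      # child analytic map record
    "000000003": {"lkr_up": "000000001"},
}
links = build_links(records)
```

Only the child records carry a link, yet the parent ends up with a list of its analytics and each analytic ends up with the parent's location, which is the effect the workflow relies on for requesting.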

Descriptive Metadata: Copy- 
Specific Information 

Recording the condition description 
of each individual map was considered 
to be vital to identify unequivocally 
the unique copy of each image owned 
by the BL. This presented a chal- 
lenge as basic copy-specific informa- 
tion was previously only recorded in 
item records and inconsistently and 
sporadically in bibliographic and hold- 
ings records. In response to the spe- 
cific needs of this project and others 
throughout the BL, a new policy to 
enter copy-specific information at the 
holdings and bibliographic level in 
standard MARC fields was devised, 
allowing for such cases where ana- 
lytics are used to represent part of 
a work. 21 Following this policy, the 
condition description is transcribed 
in the 562 field of the analytic biblio- 
graphic record instead of in the hold- 
ings record (as analytics may not have 
their own holdings records). 



The content of the condition note 
(field 562) included the location of 
the map in the volume; description 
of paper, including location of water- 
marks or inequalities; printing, not- 
ing strength of impression, bleeding, 
offsetting, or plate marks; damage, 
such as stains, wormholes, tears, and 
repair work; or other markings includ- 
ing coloring or annotations. Because of 
the free-text nature of this field, a style 
sheet was developed with colleagues 
in the BL's Early Printed Collections 
to establish agreed vocabulary, abbre- 
viations, and punctuation. 

Identifying the copy as belonging 
to the BL within the relevant field 
was essential if the record was shared 
with another institution. Copy-specific 
fields in BL collection items are distin- 
guished by the shelfmark of the copy 
being described preceding all con- 
tent in the first subfield within each 
copy-specific field. This is because, 
although each note is linked to a single 
holdings record, it will be displayed 
in a bibliographic record that may be 
linked to several holdings. Also, as the 
details described only pertain to the 
BL copy, $5Uk is added to the end of 
each copy-specific field to identify the 
institution where the copy is held. 
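
The two conventions just described (shelfmark first, $5Uk last) lend themselves to a small helper. The sketch below is illustrative only: the choice of $a as the first subfield, the colon separator, and the sample shelfmark and note text are assumptions, since the agreed style sheet is not reproduced here.

```python
def copy_specific_field(tag, shelfmark, note, institution="Uk"):
    """Return a copy-specific note as a display string.

    The shelfmark leads the first subfield and $5 closes the field.
    Using $a as the first subfield and ": " as the separator are
    assumptions made for illustration.
    """
    if not shelfmark.strip():
        raise ValueError("a copy-specific note needs the shelfmark of the copy")
    return f"{tag} $a{shelfmark.strip()}: {note.strip()} $5{institution}"


field = copy_specific_field(
    "562", "SHELFMARK-A", "Map at p. 12; small tear at lower margin."
)
```

Generating the field from one function keeps the shelfmark prefix and the institution code from being omitted when notes are keyed by hand.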

Including an indication that the 
map has been digitized (with the 
project affiliation) in each record was 
desirable to ease retrieval of maps 
included in the digitization project. 
This information was recorded in the 
area of Preservation and Digitization 
Actions (field 583), an area that could 
be compared to elements in preserva- 
tion metadata schema. Use of this field 
is, as a matter of BL policy, normally 
reserved for use by conservation staff 
who use the dedicated Preservation and 
Conservation Management System, a 
separate instance of Aleph reserved 
for conservation work. Nevertheless, 
this field was used throughout this 
project so that the existence of a digital 
copy could be readily ascertained. This 
field is suppressed from public view by 



specifying a particular indicator value, 
which the OPAC has been configured 
to take into account. 
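
Suppression by indicator value can be pictured as a simple display filter. MARC 21 defines the first indicator of field 583 as a privacy flag, with 0 meaning private; whether the BL configuration keys on that particular value is an assumption here, and the record content is placeholder data.

```python
def public_fields(fields, suppressed=(("583", "0"),)):
    """Drop fields whose (tag, first indicator) pair is configured as non-public."""
    hidden = set(suppressed)
    return [f for f in fields if (f["tag"], f["ind1"]) not in hidden]


record = [
    {"tag": "245", "ind1": "1", "value": "Map title (placeholder)"},
    {"tag": "583", "ind1": "0", "value": "digitized (placeholder action note)"},
    {"tag": "690", "ind1": " ", "value": "Scanned maps and views"},
]
opac_view = public_fields(record)                 # public display
staff_view = public_fields(record, suppressed=()) # staff display keeps everything
```

The field stays in the record and remains searchable by staff; only the public rendering omits it.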

User Needs: Electronic Searching 

Most of the relevant fields in the 
bibliographic record were already 
indexed and visible, and so search- 
able and viewable in the ILS con- 
figuration. As part of the work to 
compile the "British Library Policy 
for Copy-Specific Information," all 
relevant fields in the bibliographic 
and holdings record were reviewed, 
and relevant changes were made to 
the ILS to ensure that they were 
indexed and visible within the staff 
view. 22 This guaranteed searching and 
access to the records across the staff 
view. For display in the OPAC, fields 
considered sensitive or unnecessary 
for the public to view (e.g., 583) were 
suppressed from public display as 
described above. Interaction with the 
BL's requesting system was integrated 
in that, just as the parent book can be 
requested directly using the standard 
requesting function in the OPAC, 
the analytic may be requested in the 
same way. As previously stated, this 
is because the analytic also contains 
the location details of the parent book 
because of the functionality of the 
LKR field. 

The needs of users searching the 
OPAC differed little from the original, 
internal-only audience. To staff and to 
the public educated in the structure 
and elements contained in library cata- 
logs, searching by title, author, or sub- 
ject are the expected retrieval methods 
for published materials. Within the 
OPAC, searches for records in the 
project may be limited by searching 
only within the Digital Items subset 
of the overall collection, or by searching 
for the Local Subject Heading 
"Scanned Maps and Views." Outside 
of the OPAC, the new presence of 
these materials is highlighted within 
BL Help for Researchers webpages, 
Figure 4. Screenshot of ILS Staff Interface, with the Digital Image Displayed alongside 
the Record 



offering guidance on searching the ILS 
for maps, so it was not considered nec- 
essary to create additional webpages. 

User Needs: Access to Images 

The project team wanted to pro- 
vide precise and quick retrieval of 
the images through a bibliographic 
search. The image and its metadata 
are linked to the bibliographic record. 
Immediate access to images serves 
several functions: 

• It enables project staff to ver- 
ify the ILS record and image 
authenticity as well as check the 
correct file naming assignment 
using a single system. 

• It provides a visual finding 
aid (or a digital surrogate, 
depending on what is being 
investigated) to assist users in 
determining whether the mate- 
rial is of sufficient interest to 
warrant requesting the original 
volume. 

• The storage of the image in the 
OPAC, with the bibliographic 
record and the requesting sys- 
tem, provides increased access 
in an immediate and familiar 
interface while sparing the frag- 
ile materials from unnecessary 
handling. 

• The ADAM module allows cap- 
ture of further technical meta- 
data for rights management and 
access control with the image. 

The ADAM module, which 
enables image files to be managed, 
delivered, and discovered, allowed 
"access images" (i.e., cropped, low- 
resolution JPEG images) to be attached 
to the analytic bibliographic records. 
Though the module is not yet widely 
used in the BL, permission was grant- 
ed by the ILS Service Management 
Group to the project team for this ini- 
tiative. A low-resolution access image 
of approximately 100-250 KB was created 
at the time of image capture for 
storage in ADAM. Like the TIFF 



master file to be stored in the library's 
digital archive, it is named according 
to the map's unique identifier, 
the ILS system number of the bib- 
liographic record. These images are 
added manually to each of the records, 
thereby going through another process 
of ensuring the number, metadata, 
and record match. Figure 4 presents 
the object opened alongside the bib- 
liographic record in the staff interface 
of the ILS. 
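
Because each image file is named after the system number of its record, the match between file and record can also be checked mechanically before or after the manual attachment step. The following is a sketch under that naming assumption; the system numbers and file names are invented for the example.

```python
from pathlib import PurePath


def mismatched_images(attachments):
    """Return (system_number, file_name) pairs whose file stem is not the system number.

    `attachments` is an iterable of (system_number, file_name) pairs.
    """
    return [
        (sysno, name)
        for sysno, name in attachments
        if PurePath(name).stem != sysno
    ]


attachments = [
    ("000000002", "000000002.jpg"),  # access image named after its record
    ("000000003", "000000030.jpg"),  # transposed digits: should be flagged
    ("000000004", "000000004.tif"),  # master file follows the same rule
]
problems = mismatched_images(attachments)
```

A check of this kind supplements, rather than replaces, the visual comparison of image and record made by project staff.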

In preparation for the records and 
attached images to be made visible in 
the OPAC, a batch service was run by 
the ILS team to create thumbnails for 
all of the objects. A screen shot of the 
appearance of records in the OPAC, 
with the thumbnail alongside, may 
be seen in figure 5. At the bottom of 
each record is an icon that links to 
the access-sized image of the file in a 
separate window. 

The use of ADAM meant that 
some of the MARC fields that have 
become standard at other institutions 
for cataloging digital images and elec- 
tronic reproductions were not used. In 
such cases, when an item is available 



on the Web, the MARC 856 field 
(Electronic Location and Access for 
"information needed to locate and 
access an electronic resource") will 
contain the URL with an active hyper- 
link to the raster image. In the case 
of a reproduction of a print item, the 
record may include a second 007 field 
in the bibliographic record to describe 
the digital reproduction. Because the 
location of the file was not being pro- 
vided, and the 006 field was applied to 
designate the secondary digital form, 
neither of these fields was used. The 
first template began with a MARC 530 
(Additional Physical Form Available) 
note, but this was eschewed by the 
time of the second mutation because 
it repeated data already present in the 
record. 

This tool is successful in managing 
the digital images for internal project 
purposes; for users, it provides an 
irreplaceable visual aid for an essen- 
tially graphic format that is difficult 
to visualize on the basis of the textual 
information supplied in a bibliograph- 
ic record. Clicking on the thumbnail 
image produces a pop-up window 
Figure 5. A User's View of the Record with a Thumbnail of the Attached Image in the 
OPAC (Cropped Image Represents a 1585 Map of Virginia within a 1590 Book) 



with the enlarged access image, as 
in figure 6. Unlike most projects that 
make images available through a Web 
browser, however, there is no inter- 
active functionality with the images. 
This could be a serious disadvantage 
if a user is interested in conduct- 
ing research solely within the OPAC 
rather than using the images as a 
finding aid to consider if the original 
is relevant or worth consulting. The 
smallest typeface and other details on 
the map may not be sufficiently leg- 
ible in the access images. 

Along with item and holdings 
records, the digital object record cre- 
ated using ADAM forms part of the 
array of administrative data linked 
to the bibliographic records for the 
image and its host item. The ADAM 
record can contain metadata on copy- 
right, access permissions to view the 
image, and technical details of the 
image. These are not as detailed or 
as sophisticated as they could be in 



METS, but they served the purpose of 
this project. 

The Effect of Changing Standards 

Using established international cat- 
aloging rules and format standards 
throughout the library ensures flexibil- 
ity with changing technology, leaving 
open the possibilities for alternate sys- 
tems, expanded functionality, and sec- 
ondary uses. Amid the rapid changes 
taking place in library systems, image 
management, and metadata standards, 
this is particularly important for small- 
scale project work, which can easily be 
left behind by larger changes. In the 
future, the BL will presumably move 
to an XML-based schema to repre- 
sent descriptive metadata (though it 
is too early to speculate exactly what 
form this will take). The general trend 
in libraries is to move away from tra- 
ditional OPACs and to replace these 
with Web interfaces that offer more 



sophisticated and configurable dis- 
play options and more user-controlled 
activities, such as tagging. An XML- 
based format is required for effective 
integration of bibliographic data in 
Web interfaces. Although MARC and 
the current data structures in the BL 
ILS are satisfactory, any move to an 
XML-based format will provide an 
opportunity to look at the data afresh 
with a view to improve its structure 
and display. 

The BL is already planning 
to move to RDA, the successor to 
AACR2, in 2010. RDA is based 
on FRBR principles. In traditional 
cataloging, bibliographic units are 
described out of context; in FRBR, 
items must be described in context 
in a manner sufficient to relate the 
item to the other items making up the 
work. In the fullest implementation of 
RDA (Scenario 1), data is stored in a 
relational or object-oriented database 
structure that mirrors the FRBR and 
FRAD conceptual models. This more 
effectively supports what this project 
attempted to achieve — expressing the 
relationship between individual digi- 
tized images and their host work, docu- 
menting information about the image 
and making it accessible. The BL will 
initially implement Scenario 2 of RDA 
(currently scheduled for the first half 
of 2010), in which bibliographic and 
authority files are linked and which the 
BL ILS supports. This means that the 
introduction of RDA into the BL will 
not have any effect on this project. 

Discussion 

Several larger issues emerged during 
the time span of the project, which, 
rather than being discussed in detail 
here, will be noted as ongoing con- 
cerns. 

Library Security 

The issue of security arose as a 
result of the proposal to make the 
information available to the public. 
Curators questioned whether exposure 
(i.e., making users aware of the 
existence of library materials) would 
make those items more vulnerable 
to theft. This attitude is an enduring 
one and counters widespread library 
practices such as cataloging, digitiza- 
tion, and creation of indexes, research 
guides, and finding aids. There has 
been extended discussion among pro- 
fessionals on library security of maps, a 
topic beyond the scope of this paper. 23 
It is generally felt that a descrip- 
tive catalog record, especially one that 
includes copy-specific information, is 
a record of ownership and serves to 
protect materials. 

The Role of the OPAC 

Digitized images and associated 
metadata are often presented through 
separate, dedicated project websites, 
even when prepared by libraries with 
an OPAC. This may be because of 
limitations of library system technol- 
ogy or metadata structures in the past. 
Alternately, it could be attributed to 
the inevitable progression of research 
methods, user expectations, and infor- 
mation access in society, which can 
make OPACs, AACR2, and MARC 21 
seem inadequate and obsolete. 24 

Responsibilities within the Institution 

Though this paper only discusses 
a single element of the project (i.e., 
metadata capture), the project team 
was relatively successful in bring- 
ing together various departmental 
interests and input. This brought out 
questions, however, as to who should 
be doing what and for whom within 
the institution, given the number of 
tasks that were new and not neatly 
designated in job descriptions or by 
precedents. Frequently, libraries have 
dedicated staff for digitization, but 
even in the best cases the institutional 
infrastructure is relied upon to make it 
functional. This raises the question as 
to whether many librarians will move 
Figure 6. An Enlarged Access Image Generated by Clicking on the Thumbnail in the 
OPAC 



from project to project, or if project 
tasks will be written into jobs. 

Conclusion 

This small digitization project at the 
BL was an opportunity to test how 
current cataloging codes and format 
standards can accommodate meta- 
data and image capture within the 
ILS. The project successfully fulfilled 
the collection security needs of the 
organization while demonstrating that 
this approach can offer an improved 
product, thereby increasing the access 
and visibility of collection items to 
better meet the needs of research- 
ers and providing the organization 
with data whose authenticity can be 
preserved and used in future systems. 
As opposed to purchasing a dedi- 
cated digital collection software suite 
or developing new websites that may 
or may not be found by library users, 
the collection items are integrated 
into the library catalog. The image, 
together with complete bibliographic 



information about both the map as an 
independent resource and its source 
book volume, may be retrieved with 
other library holdings in the OPAC. 

The use of library system tech- 
nology for creating and organizing 
metadata and making it searchable by 
users, and the MARC format with its 
flexibility and ability to handle differ- 
ing levels of granularity and formats, 
is a powerful combination for han- 
dling digitized objects. The combi- 
nation of established standards such 
as MARC21 and AACR2 with the 
ILS, which operates in a similar way 
to many other ILSs, means that the 
approach described in this paper can 
be propagated to any library that uses 
these standards and has a comparable 
ILS to describe formerly "hidden" col- 
lection items. 
Appendix. List of the Fields in the Template for the Analytic Bibliographic Record for the Map 



FMT (The Map format was selected 
as the format in all cases, as it cov- 
ers both maps and views.) 

Leader (Type of record identified as 
Cartographic Material) 

001 — Record Control Number (The 
unique identifier for the record) 

006 — Fixed-Length Data Elements- 
Additional Material (A data ele- 
ment in this field indicating the 
materials' form denotes that the 
paper item cataloged has also 
been captured as a digital repro- 
duction. This was used in prefer- 
ence to a second 007.) 

007 — Physical Description Fixed 
Field-General Information (A data 
element in this field, Category of 
material, indicates that the item 
is a map.) 

008 — Fixed-Length Data Elements 
(The following areas are used: 
date, place of publication, lan- 
guage, and type of cartographic 
material. For the latter, all are 
"g" to identify the item as "map 
bound within another work.") 

034 — Coded Cartographic Mathe- 
matical Data 

040 — Cataloging Source 

100 — Main Entry-Personal Name 
(These headings are subject to 
authority control.) 

245— Title Statement (and 246) 

255 — Cartographic Mathematical 
Data (This cartographic materials- 
specific field, indicating scale, 
projection, and coordinates, was 
uniformly supplied as "scale not 
given." Scales were not generally 



supplied in a standard form on 
maps at the time, and most maps 
in the project that contained scale 
information expressed it in the 
form of a graphic scale or, in some 
cases, a verbal statement refer- 
ring to scales no longer used, e.g., 
chains. Deciphering and translat- 
ing either to a representative frac- 
tion in accordance with AACR2 
would have meant intensive labor 
producing only lukewarm results. 
Therefore the decision was made 
that the "scale not given" option 
for early cartographic materials 
would be applied.* This decision 
will be reviewed for phase two of 
the project, given the importance 
of scale. Geographic coordinates 
were not supplied for a similar 
reason. This too requires review, 
given the advantages of future 
potential display options to pres- 
ent materials in a geographical 
context. Both of these areas affect 
the contents of the correlated 
code field (034).) 

260 — Publication, Distribution, etc. 
(In most cases, this matched the 
date and place of publication 
for the book. In the case where 
there was a difference between 
the date printed on the map and 
that stated in the imprint of the 
book, the date on the map was 
recorded first, followed by the 
book imprint in brackets. A 500 
note was created in explanation.) 

300 — Physical Description Area (The 
extent of the cartographic item 



was in all cases named as either 
"map" or "view." Most maps were 
printed in black and white, with 
less than 1 percent in color. Also 
in this area are listed the dimen- 
sions, i.e., height x width of the 
map, the plate, and the sheet. 
Measurements were rounded to 
the nearest half centimeter.) 

510 — Citation/reference note to pub- 
lished bibliographic descriptions, 
reviews, abstracts, or indexes 
(These citations provided addi- 
tional information that could aid 
in deciphering the map, docu- 
menting the significance of the 
piece, and informing scholars 
that the map has been described 
extensively elsewhere. A maxi- 
mum of three recent citations 
per record were referenced, with 
priority given to those works in 
English.) 

583 — Action Note: Preservation & 
Digitization Actions (This note 
records information about pro- 
cessing, reference, and preserva- 
tion actions. The material specified 
was consistently "map.") 

690 — Local Subject (The records were 
united by the locally assigned 
"scanned maps and views.") 

651 — Subject Added Entry- 
Geographic Name (Although map 
records historically have used 
a dedicated subject system, the 
Map Library commenced using 
standard LCSH in 2004 with the 
move to Aleph. These headings 
are subject to authority control.) 



* Elizabeth Mangan, ed., Cartographic Materials: A Manual of Interpretation for AACR2, 2002 Rev., 2005 Update, 2nd ed. (Chicago: 
ALA, 2006), Appendix G.2. 
Notes on Operations 

Automated Metadata Harvesting: 
Low-Barrier MARC Record 
Generation from OAI-PMH 
Repository Stores Using MarcEdit 

By Terry Reese 



For libraries, the burgeoning corpus of born-digital data is becoming both a 
blessing and a curse. For patrons, these online resources represent the potential 
for extended access to materials, but for a library's technical services department 
they represent an ongoing challenge, forcing staff to look for ways to capture and 
make use of available metadata. This challenge is exacerbated for libraries that 
provide access to their own digital collections. While digital repository software 
like DSpace, Fedora, and CONTENTdm expose bibliographic metadata through 
the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH), few 
organizations have a simplified method for harvesting and generating Machine- 
Readable Cataloging (MARC) records from these metadata stores. Fortunately, a 
number of tools have been developed that can facilitate the harvesting and gen- 
eration of MARC data from these OAI-PMH metadata repositories. This paper 
will examine resources that enhance technical services staff's ability to use existing 
metadata, with specific focus on one of these current generation tools, MarcEdit, 
which was developed by the author and provides a one-click harvesting process 
for generating MARC metadata from a variety of metadata formats. 

On December 11, 2007, Perry Willett, head of the Digital Library Production 
Service at the University of Michigan (UM) Library, posted a message to 
the XML4Lib electronic discussion list indicating that metadata for the public 
domain materials made available through the UM Library Google Books project 
were now available for Open Archives Initiative Protocol for Metadata Harvesting 
(OAI-PMH) download. 1 The OAI-PMH protocol was primarily developed as a 
low-barrier method for interoperability between metadata repositories. Using the 
protocol, structured bibliographic metadata can be shared between repositories 
and other metadata harvesters. The announcement was significant in two ways. 
First, it represents the first such announcement by a member of the Google 
Books collaboration. Second, the announcement underscores a growing trend in 
digital library development — widespread harvestability of a project's digital items 
and its metadata. Announcements such as these represent a boon for libraries and 
their patrons. As more collections move into the digital space, library patrons can- 
not help but benefit. However, for library technical services offices, announce- 
ments such as this can present new challenges. This paper considers options for 
handling these challenges by focusing on one tool, MarcEdit (http://oregonstate 
.edu/~reeset/marcedit). 
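
To make the "low-barrier" character of the protocol concrete: an OAI-PMH harvest begins with an ordinary HTTP request whose query string names a verb and a metadata format. The sketch below builds such a request with the Python standard library. The verb `ListRecords`, the prefix `oai_dc`, and the `set` and `from` arguments come from the OAI-PMH specification; the base URL and set name are hypothetical, and nothing here reflects how MarcEdit issues its requests.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None, from_date=None):
    """Build an OAI-PMH ListRecords request URL."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec      # restrict the harvest to one collection
    if from_date:
        params["from"] = from_date    # restrict to records changed since a date
    return f"{base_url}?{urlencode(params)}"


# Hypothetical repository endpoint and set name.
url = list_records_url("https://repository.example.org/oai", set_spec="theses")
```

The response to such a request is an XML document containing the metadata records, which is what any harvester must then convert.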

As more digital services like the UM Library Google Books project move 
their metadata into the public Web space, library technical services depart- 
ments need to determine how they will make use of this new influx of available 
metadata. For sure, some libraries have become accustomed to the many issues 
dealing with non-MARC metadata within what is still largely a MARC-centric 






universe. For libraries hosting digital 
collections or institutional repositories, 
challenges related to the representa- 
tion of those digital objects within 
a library's many discovery tools like 
the OCLC Online Computer Library 
Center's (OCLC) WorldCat or local 
integrated library system (ILS) are 
commonplace. While most digital col- 
lection software (for example, DSpace, 
Fedora, and CONTENTdm) and 
many vendor product solutions (like 
NewsBank's Congressional Serials Set) 
provide the ability to harvest item 
metadata by using OAI-PMH, few 
libraries use these metadata streams to 
generate MARC records. The process 
of downloading, converting, and man- 
aging metadata records beyond the 
traditional MARC metadata workflow 
remains largely unexplored in many 
libraries. For those that do repur- 
pose non-MARC metadata in some 
way, the process is often limited to a 
single service or metadata stream. For 
example, both Texas A&M University 
Libraries and the University of Virginia 
(UV) Library documented their efforts 
to develop site-specific metadata har- 
vesters for converting bibliographic 
metadata for electronic theses and 
dissertation records submitted to their 
institutional repositories into MARC. 2 
Non-MARC metadata models for 
sharing digital metadata are not likely 
to disappear, and technical services 
departments will need to adjust to new 
forms of metadata acquisition. During 
the past twenty years, OCLC and the 
Library of Congress (LC) have pro- 
vided libraries with a single, central- 
ized metadata repository from which 
to download bibliographic metadata. 
While OCLC remains the largest data- 
base of available bibliographic content, 
the actual distribution of metadata 
today is becoming much more decen- 
tralized. Institutional repositories and 
digital collection software have played 
a role in moving the library from meta- 
data consumers and creators to meta- 
data distributors. For libraries looking 
to leverage content housed in digital 



collections, understanding and devel- 
oping processes of harvesting and con- 
verting non-MARC metadata will be 
essential for moving forward. 

Together, the Open Archives 
Initiative (OAI, www.openarchives 
.org) and library communities have 
worked in recent years to provide 
a number of tools to facilitate the 
harvesting and conversion of OAI- 
PMH-compliant metadata into other 
delivery formats, both non-MARC 
and MARC. Traditionally, these tools 
have been released as parts of "kits" 
or components that library developers 
could use in specialized conversion 
tools. However, while these tools and 
kits have provided library information 
technology (IT) departments greater 
access to bibliographic metadata, they 
have done little to help technical ser- 
vices departments deal with OAI-PMH 
data. More recently, OCLC released an 
updated version of its Connexion soft- 
ware that provides limited capabilities 
for metadata harvesting of up to one 
hundred records through OAI-PMH; 
the software supports various flavors 
of Dublin Core (DC). This is a step 
in the right direction, but it provides 
no flexibility for customizing the data 
conversion itself, thus making record 
creation a one-size-fits-all process. 
The flexible nature of non-MARC 
metadata formats coupled with the 
lack of a formal standard for inputting 
metadata within non-MARC formats 
has made metadata creation somewhat 
uneven and not easily managed using a 
generic conversion process. The issue 
is well known in cataloging circles, as 
noted in an article found in Online 
Libraries and Microcomputers. 3 Here 
the author notes the many challeng- 
es one encounters when attempting 
to crosswalk metadata from one for- 
mat to another. The one-size-fits-all 
approach to metadata is problematic 
because of issues related to granu- 
larity and consistency. Crosswalking 
metadata from one level of granular- 
ity to another is always difficult. For 
example, when moving from a schema 



of high granularity like MARC to a 
less granular schema like DC, the loss 
of both bibliographic content as well 
as context is often unavoidable. For 
instance, MARC 21 has numerous 
fields to represent the "author" of an 
item with each field containing contex- 
tual information about that "author." 
In unqualified DC, this context and 
granularity is lost because all "authors" 
are placed into a single dc:creator element. 
Likewise, metadata of lower 
granularity cannot easily be moved 
to schemas with higher granularity 
because context and content cannot 
be manufactured if it is not present 
within the original record. Second is 
the issue of consistency. Although all 
DSpace and CONTENTdm software 
platforms use DC as the method for 
primary markup, the best practices 
used when generating metadata vary 
widely, potentially varying between 
projects within a single institution. 
The lack of a national standard or 
shared best practices when creating 
non-MARC metadata has contributed 
to a high level of inconsistency in the 
metadata currently being produced. 
This inconsistency makes capturing 
subtle relationships expressed within 
the metadata difficult and can result 
in overly broad and only marginally 
useful MARC records generated using 
these generic translation processes. 
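
The granularity problem can be shown with a toy round trip. This is a deliberately simplified sketch, not a complete crosswalk: only three MARC name fields are considered, the names are placeholders, and routing every returning value to field 720 (the uncontrolled name field) is one conservative choice among several a cataloger might make.

```python
# MARC name fields that all collapse into the same unqualified DC element.
NAME_TAGS = {
    "100": "personal name, main entry",
    "700": "personal name, added entry",
    "710": "corporate name, added entry",
}


def marc_to_dc(fields):
    """Collapse MARC name fields into a flat list of dc:creator values."""
    return [value for tag, value in fields if tag in NAME_TAGS]


def dc_to_marc(creators):
    """Map dc:creator values back to MARC without guessing at lost context.

    Nothing in the flat list says which name was the main entry or whether
    a name is personal or corporate, so every value lands in field 720.
    """
    return [("720", value) for value in creators]


marc = [("100", "Author, First"), ("700", "Editor, Second"), ("710", "Issuing Body")]
dc = marc_to_dc(marc)
round_trip = dc_to_marc(dc)
```

The names survive the trip, but the distinctions among main entry, added entry, and corporate body do not, and no generic process can restore them.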

Seeing a need for a process that 
both flexibly and reliably converts 
metadata from OAI-PMH metadata 
stores into bibliographic formats usable 
by its online catalog, Oregon State 
University (OSU) Libraries chose to 
use MarcEdit, a freely available client 
application (developed by the author) 
that offers default conversion sup- 
port from OAI-PMH metadata to a 
number of different metadata formats. 
This paper will provide a brief discussion 
of MarcEdit's metadata harvesting 
functionality as well as provide a 
detailed description of two potential 
use cases. The first example details 
how OSU Libraries catalogers use 
MarcEdit to harvest unqualified DC 
metadata from electronic theses in its 
institutional repository and automati- 
cally generate MARC 21 records for 
inclusion into both the online cata- 
log and WorldCat. Through this con- 
version process, OSU Libraries has 
been able to avoid expensive effort 
duplication and, more importantly, has 
developed a simple workflow that can 
be used by technical services staff 
to capture OAI-PMH metadata from 
any OAI-PMH provider and generate 
records for the OSU Libraries cata- 
log or OCLC. The second example 
demonstrates how staff can generate 
MARC records from the UM Library 
Google Books metadata. The process 
will detail some of the problems that 
can be encountered while working 
with metadata from remote metadata 
repositories as well as ways of over- 
coming those challenges. 

Literature Review 

Given the pervasiveness of Extensible 
Markup Language (XML)-based 
metadata and the wide range of pro- 
tocols that support and advertise the 
presence of available metadata, it 
is surprising that automated metadata
harvesting, MARC record generation,
and library staff-centric tools
development are not more frequently
addressed in the literature. Several
articles detail the process of indexing 
and harvesting MARC data into other 
indexing systems like Solr (http:// 
lucene.apache.org/solr). Likewise, 
tools like Villanova's VuFind (www 
.vufind.org) and UV Library's Project 
Blacklight (http://blacklight.betech 
.virginia.edu) have advanced discussions
relating to MARC indexing in
non-MARC environments.
Only a few articles discuss processes 
for reusing XML-based metadata for- 
mats in MARC environments, and 
fewer still have been written specifi- 
cally for technical services staff. Most 
have concentrated on the potential 
for reusing existing metadata in one's 
institutional repository to generate 
MARC records for submitted elec- 
tronic theses and dissertations. 

Surratt and Hill's article on 
the development of a customized 
ETD2MARC processing documented 
how Texas A&M University Libraries 
was able to customize a process devel- 
oped by UV Library to provide a semi- 
automated record generation tool. 
Integrated into their workflow, the tool 
provided a way for staff to automati- 
cally generate MARC records for items 
as they were submitted into their insti- 
tutional repository. 4 The resulting files 
from the metadata translation were 
dirty, core-level MARC records, which 
were then reviewed and edited by a 
staff member and finally entered in the 
online catalog and sent to OCLC. Texas 
A&M University Libraries' conversion 
script allowed their catalogers to more 
efficiently process electronic theses 
and dissertations (ETDs) by making 
use of attached metadata. While the 
article provided a copy of the script 
used to perform the conversion pro- 
cess, little evidence suggests that other 
institutions were able to use the Texas 
A&M University Libraries' method 
to promote metadata repurposing at 
their own institutions. The reason lies 
in the implementation. The process 
documented by Surratt and Hill fulfills 
the needs of the organization but is 
so tightly coupled to the organiza- 
tion's workflow that it becomes unus- 
able without significant revision when 
taken outside of that environment. In 
addition, the process of data conver- 
sion was moved outside of technical 
services, meaning that a firewall was 
placed between the catalogers and the 
developers that created the script. 

An article by Kurth, Ruddy, and 
Rupp documents an ongoing metadata 
repurposing project at the Cornell 
University (CU) Library. Unlike the 
process documented by Surratt and 
Hill, the CU Library project looks at 
the development of a service to repur- 
pose MARC metadata for use within 
one's digital library infrastructure. 5 
Kurth, Ruddy, and Rupp note that 
metadata currently found within the 
online catalog could be used to enrich 
many of the digital services and proj- 
ects at CU Library. However, to use 
this metadata, a system needed to be 
developed that broke down MARC 
metadata and reassembled it for use in 
the Text Encoding Initiative (TEI) and 
DC. What makes this system inter- 
esting is the cooperative relationship 
between CU Library's metadata ser- 
vices and its IT department. While the 
article notes that the IT department 
develops and maintains the MARC 
processing scripts and document type 
definitions (DTD) for validation and 
creates the Extensible Stylesheet 
Language Transformations (XSLTs) 
used to crosswalk MARCXML data 
to TEI or DC, the collection-specific 
MARC mappings were created in con- 
junction with stakeholders from within 
the library. Since metadata conver- 
sions feed metadata directly to specific 
digital projects, the conversion must 
be completely automated. In this case, 
that is possible because of the con- 
trolled nature of the metadata and the 
granularity of the destination metadata 
schema. 

A 2005 article by this author in 
the Journal of Map and Geography 
Libraries described the process used 
by OSU Libraries to generate MARC 
records for Geographic Information 
Systems (GIS) datasets from the 
accompanying Federal Geographic 
Data Committee (FGDC) meta- 
data records. 6 Using MarcEdit, OSU 
Libraries was able to create a generic 
XSLT stylesheet that could be used 
as a template for translating FGDC 
metadata to MARC 21 XML. Once in 
MARC 21 XML, MarcEdit is able to 
translate the metadata into MARC 21 
as well as accommodate character set 
translations between the legacy MARC- 
8 and more current 8-bit Unicode 
Transformation Format (UTF-8). 
Because of the richness of data found 
within the FGDC data format, the 
MARC records generated from the 
FGDC data sources often included
much more detailed information than 
records generated without the FGDC 
metadata. Although this process is not 
fully automated because records are 
not harvested and translated automati- 
cally, the process is portable. 

First, MarcEdit uses a frame- 
work that allows metadata, once 
converted to MARC 21 XML, to be 
translated to any metadata format 
registered with the application. For 
Oregon State University, that meant 
that once the FGDC crosswalk was 
developed, catalogers could produce 
records in MARC, MARC 21 XML, 
DC, DC Qualified, Metadata Object 
Description Schema (MODS), and 
Encoded Archival Description (EAD) 
records from a single FGDC source. 
Second, the user of the application 
has full control over the crosswalk 
itself, meaning that the cataloger is 
free to modify the conversion rules. 
This allows the cataloger to control the 
conversion between metadata formats 
with a greater level of granularity. 

While the literature and toolsets 
for technical services focus on uses 
of existing XML-based metadata for- 
mats within existing MARC environ- 
ments, a great deal of literature exists 
outside technical services on harvest- 
ing and repurposing metadata for 
the development of external services 
and metasearch repositories. Articles 
like Simons and Bird's "Building an
Open Language Archives Community
on the OAI Foundation" or Suleman and
Fox's "Leveraging OAI Harvesting 
to Disseminate Theses" look at the 
OAI-PMH standard and the role that 
it can and has played in setting up 
large, ad hoc document communi- 
ties. 7 Several data aggregations such as 
UM's OAIster (www.oaister.org) proj- 
ect, which provides a single point of 
query for more than 19 million records 
(as of December 2008), or Emory 
University's AmericanSouth.org proj- 
ect (now ceased), which focused on 
the aggregation of cultural and histori- 
cal content, have been based on the 
concept of harvesting available meta- 
data and repurposing it to draw connec- 
tions and build virtual collections and 
local aggregations. 8 Out of these proj- 
ects have come tools and frameworks 
that can be used to build additional 
metadata aggregations and services. 
The Metadata Migrator (www.meta 
scholar.org/sw/mm), a self-contained 
application designed as a crosswalk 
for and generator of DC data files and 
that can be served as part of an OAI- 
PMH repository, is one such resource 
to come out of the MetaScholar initia- 
tives (www.metascholar.org), a digital 
library project at Emory University. 
Many exemplary projects like Picture 
Australia (www.pictureaustralia.org) 
and the Networked Digital Library of 
Theses and Dissertations (www.ndltd 
.org) have been developed through the 
aggregation and harvest of OAI-PMH 
metadata, demonstrating the availabil- 
ity of metadata for many of the digital 
items currently being generated by 
researchers and universities around 
the world. Libraries and their techni- 
cal services departments could take a 
cue from these projects as they look 
to collect and provide access to digital 
resources through their organization's 
primary discovery tools, which are 
often still online catalogs. 

For libraries looking to use and 
expose their OAI-PMH-based meta- 
data products, making the technical 
information about these resources 
available to the larger library commu- 
nity will continue to be a growing chal- 
lenge. Metadata providers will need to 
consider how discovery takes place not 
just for items within their collections 
but also for the digital services that 
expose those collections. For this reason,
projects such as OCLC's Digital
Registry (www.oclc.org/registry), the 
Ockham Initiative (www.ockham.org), 
and the Joint Information Systems 
Committee Information Environment 
Registry (http://iesr.ac.uk) have worked 
to develop a flexible registry system 
for the sharing of technical metadata 
about digital collections. For technical 
services departments interested in reus- 
ing existing metadata for digital items, 
simply finding the information needed 
to access and capture that metadata
may be a significant barrier that
will continue to exist in the immediate
future. Fortunately, a number of open 
OAI metadata repositories are being 
developed to fill this need. In their 
article, "Current Developments and 
Future Trends for the OAI Protocol 
for Metadata Harvesting," Shreeves 
and colleagues made note of a num- 
ber of OAI repositories being devel- 
oped into a comprehensive knowledge 
base capable of providing the techni- 
cal information users need to harvest 
metadata. 9 As they observed, metadata 
repository development represents the 
likely future for the OAI-PMH commu- 
nity as data harvesters look for reliable 
ways to retrieve technical information 
about a given metadata community 
and to discover other communities and 
projects that may be related. 

Available Tool Sets 

Presently, a number of effective devel- 
opment toolsets and software kits exist 
to provide OAI functionality to library 
tools. Developers interested in working 
with the OAI-PMH protocol are able 
to choose from components developed 
in a variety of languages, such as the 
Perl OAI modules, the Ruby OAI 
gem, or one of the many Java OAI 
harvesting kits; components have been 
readily available for some time for 
developers looking to build resources 
to aggregate metadata together. The 
OAI keeps track of a number of user- 
contributed tools and toolkits. 10 

The primary purpose of this paper 
is to look at resources that enhance 
technical services staff's ability to take 
advantage of existing metadata, not 
to examine resources developed for 
the developer community. While a 
rich ecosystem of developer-related 
tools exists for processing OAI-PMH 
metadata, these tools provide very 
little practical benefit to most technical
services staffs and departments. To
address this absence, this paper high- 
lights two main classes of metadata 
harvesting tools currently available for 
technical services staff who wish to 
work with non-MARC metadata. 

Innovative Interfaces' XML 
Harvester 

Presently, many vendors have developed
or are developing tools to facilitate
the harvest of non-MARC metadata
into the online catalog. ILS vendors
like Innovative Interfaces have
moved to create systems to stream- 
line metadata harvesting directly into 
the online catalog. The Innovative 
Interfaces metadata solution known as 
XML Harvester, developed in coop- 
eration with Michigan State University 
(MSU), is representative of most ILS 
vendor-supplied data harvesting tools 
because it provides one-way meta- 
data conversion from a single data 
source into the online catalog. XML 
Harvester was used initially by MSU 
to generate MARC records in the 
online catalog from harvested EAD 
metadata, although today it can pro- 
vide conversions from a number of 
different metadata formats. 

XML Harvester's functionality is 
representative of most ILS vendor- 
supplied metadata harvesting applica- 
tions. Since this class of applications 
tends to run at the server level, control 
over how metadata crosswalking is 
defined will vary in granularity and 
generally be available only to IT staff 
or those at the system level. Likewise, 
this class of tools tends to be designed 
to be single project solutions, mean- 
ing that a significant amount of time 
is generally required for set up and 
testing to harvest a single collection, 
overhead that must be reallocated 
each time a new collection is set to 
be harvested. Because translations are 
tailored to specific projects or collec- 
tions, work done for one project can- 
not be shared or used when looking 
to harvest other col- 
lections. This places 
practical limits on the 
types of projects that 
these tools can sup- 
port. While the tight 
coupling with the ILS 
generally simplifies 
the process of load- 
ing and updating har- 
vested metadata, it 
does come at a price. 
XML Harvester, for 
example, can only be 
used to harvest meta- 
data into the online 
catalog and Encore 
Platform rather than as an abstract 
harvesting tool for providing metadata 
conversion services. This tends
to put very specific limits on how
useful this class of tools can be in general,
particularly when considering the
wide range of databases and services 
library technical service departments 
are being asked to maintain. The abil- 
ity to harvest metadata and convert it 
into many different formats will likely 
become more important with time, 
possibly shortening the shelf life for 
this class of applications. 

OCLC's Connexion 

Some in the vendor community are 
beginning to provide better support 
for non-MARC metadata formats. 
For catalogers, the most interesting 
recent development for this is the 
inclusion of a metadata harvesting and 
crosswalking tool directly into OCLC's 
Connexion product (www.oclc.org/ 
connexion). Given OCLC's influence 
and large number of member libraries, 
its software has the potential to sim- 
plify the metadata harvesting process 
for numerous libraries as well as lower 
the barriers to getting digital object 
metadata into the WorldCat database. 
As of version 2.10, the Connexion cli- 
ent provides OCLC members a set of 
basic metadata harvesting functional- 
ities able to process records in a variety 




Figure 1. OCLC Connexion Metadata Extraction Tool 



of DC flavors. The software is unique 
in the vendor community because it 
represents one of the first attempts 
by a vendor to shift responsibility for 
metadata harvesting and reuse from 
a library's IT staff to its technical ser- 
vices staff. Nevertheless, the current 
implementation offers little practical 
functionality. 

Figure 1 illustrates the current 
functionality provided to the user. 
Presently, users wanting to automati- 
cally generate MARC records from 
OAI DC records must download the 
record set locally before initiating this 
process. For large datasets, like UM 
Library's Digital Books project, this 
workflow would be unfeasible because 
each OAI request returns only five 
hundred items. For example, using 
this method to generate the nearly 
one hundred thousand records made 
available through UM Library's Digital 
Books project would require roughly
two hundred separate harvest requests.

One of the most distinctive aspects
of OCLC's approach has been
the decision, at least initially, to hide 
the metadata conversion process from 
the user. While this simplifies the 
overall metadata conversion process, 
it introduces a "fast food" approach 
to metadata conversion and is the 
process's greatest weakness. Given the 
number of ways that DC elements 
can be interpreted and implemented 
Figure 2. MarcEdit Spoke-and-Wheel Design



between collections within a single 
organization, such a one-size-fits-all 
approach to metadata extraction and 
generation is not likely to be useful for 
meaningful record generation. Despite 
its limitations, the OCLC Connexion 
metadata extraction tool is still a sig- 
nificant step toward mainstreaming 
metadata crosswalking for technical 
services staff. 

MarcEdit 

Overall, the vendor community has 
been making great strides toward simplifying
the process of working with
harvested metadata sets. However, at 
this point, their efforts still remain 
very project-based, making them mar- 
ginally useful as general metadata con- 
version tools for the diverse datasets 
available to the library community. 
Each serves a need, but, as general 
metadata harvesting and conversion 
tools, their inability to allow catalog- 
ers to control metadata harvesting and 
customize the conversion rules is a 
serious impediment to adoption. For 
these reasons, OSU Libraries has used 
MarcEdit for its data harvesting and 
conversion needs. 

MarcEdit is a freely available
metadata editing suite initially
conceived in 1998 as a graphical user 
interface (GUI) replacement for the 
LC's DOS-based MARCBreakr and 
MARCMakr software. Originally 
designed primarily as a batch MARC 
editing tool, the program expanded the 
functionality found in MARCBreakr 
and MARCMakr by including the 
MarcEditor, a notepad designed spe- 
cifically for the modification of batch 
MARC records. Metadata needs and 
formats have changed significantly 
since 1998, and MarcEdit has changed 
with them. Today, the name MarcEdit 
is almost a misnomer because the 
application no longer is simply a batch 
MARC editing tool. Instead, MarcEdit 
is an application suite of metadata 
editing tools, including character set 
conversion, XML crosswalking, and 
metadata harvesting. 

In many respects, MarcEdit has 
a number of things in common with 
OCLC's Connexion application. They 
are both client-side applications, 
empowering users to work with data 
from many different sources. Likewise, 
the applications work with the OAI-PMH
protocol and provide built-in
data conversion rules for supported 
metadata formats. However, MarcEdit 
takes this one step further by provid- 
ing users with the ability to customize 
the existing data conversion rules or 
create new data conversion rules. This 
allows users to harvest metadata from 
one of the supported metadata formats 
(DC, MODS, OAI MARC, or MARC
21 XML) as well as create conversion
templates for additional metadata for- 
mats. It also allows users to customize 
existing conversion templates to reflect 
many variations in best practices used 
between projects. Users are given this 
customizability through XSLT. All of
MarcEdit's metadata conversion rules
are defined as XSLT templates. 

Appendix A presents the entire 
XSLT stylesheet used for converting 
OAI MARC records to MARC 21 
XML. This is a good example because 
it underlines how readily available 
this type of crosswalking information 
already has become. This particular 
stylesheet was derived from an XSLT 
stylesheet provided by the LC and is 
one of many such examples currently 
available to the library community. 11 
Why a conversion to MARC 21 XML? 
MarcEdit uses a "wheel-and-spoke" 
method, with MARC 21 XML sitting 
at the center of that wheel. This architecture
allows metadata conversions to
be created without the need to know 
directly how the individual metadata 
elements relate to elements within dif- 
ferent schemas. Once a new spoke has 
been added to the wheel, it becomes 
crosswalkable to any other spoke on 
that wheel. 
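
The wheel-and-spoke idea can be sketched in a few lines. This is an illustration under stated assumptions, not MarcEdit's implementation: a plain dictionary stands in for MARC 21 XML at the hub, Python functions stand in for the XSLT templates, and the format names and field labels are hypothetical.

```python
# Registry of spokes. Each spoke supplies two functions: one into the hub
# format and one out of it. The "hub" is a plain dict standing in for
# MARC 21 XML; formats and field names below are illustrative only.
to_hub = {}
from_hub = {}


def register(fmt, into, out):
    """Add a spoke by recording its converters to and from the hub."""
    to_hub[fmt] = into
    from_hub[fmt] = out


def crosswalk(record, source, target):
    """Convert source -> hub -> target without a direct source/target map."""
    return from_hub[target](to_hub[source](record))


register("dc",
         lambda r: {"245a": r["title"]},
         lambda h: {"title": h["245a"]})
register("mods",
         lambda r: {"245a": r["titleInfo"]},
         lambda h: {"titleInfo": h["245a"]})

print(crosswalk({"title": "Salmon habitat"}, "dc", "mods"))

# Adding a third spoke makes it reachable from every existing spoke.
register("ead",
         lambda r: {"245a": r["unittitle"]},
         lambda h: {"unittitle": h["245a"]})
print(crosswalk({"unittitle": "Field notes"}, "ead", "dc"))
```

The design choice is one of bookkeeping: with a hub, each new format needs only a pair of converters, whereas direct format-to-format mappings would need a new pair for every format already supported.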

Figure 2 provides an illustration 
of this approach. Using this model, an 
EAD record could be translated to any 
other metadata schema on the wheel
without the need to know how the
elements in the EAD record relate to
elements in the destination format. A
user simply needs to modify or create 
a new XSLT template to modify the 
formats and behaviors of MarcEdit's 
metadata conversion process. At one 
time, finding technical services staff 
with the ability to modify or create
an XSLT document may have been 
an impediment, but the ubiquitous 
nature of XSLT has made this skill 
Figure 3. MarcEdit Welcome Screen



set much more common within the 
library community. At OSU Libraries, 
this functionality is what makes
MarcEdit's method for metadata harvesting
so valuable. Given the wealth of
metadata currently available in DC, the
ability to customize metadata conver- 
sion rules is essential to accommodating 
the variety of best practices and input 
standards. Staff members also have the 
option of either accepting the template- 
generated metadata or continuing to 
be active participants in the metadata 
creation process. 

Harvesting from OAI 

Like Connexion, MarcEdit simpli- 
fies the process of harvesting OAI- 
PMH-based metadata. Upon startup, 
users are greeted with the MarcEdit 
welcome screen (see figure 3), which 
includes links to commonly used func- 
tionality. Here one will find a link 
to MarcEdit's OAI Data Harvester, 
which initiates the data harvesting ser- 
vice (see figure 4). 

Once initialized, the user needs to 
provide the Metadata Harvester with 
only the host (URL) and set (collec- 
tion name) for the set of records to be 
harvested. Users can optionally change 
the metadata type being requested 
from the server as well as define 
their own set of translation rules. 
Once set, the MarcEdit Metadata 
Harvester captures and translates the 
set's metadata records 
from the defined meta- 
data type to MARC. No 
interaction is required 
by the user. Users who 
wish to do more gran- 
ular data harvests can 
select the advanced set- 
tings link to use some of 
the optional parameters 
supported within the 
OAI-PMH specifica- 
tion. The advanced set- 
tings function reveals 
a cache of additional 
options that can be set 
to define what records are to be har- 
vested by the Metadata Harvester 
(see figure 5). 
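
Behind the host and set fields sits an ordinary OAI-PMH request. The sketch below shows how such a request URL is typically assembled. The verb and argument names (ListRecords, metadataPrefix, set) come from the OAI-PMH specification. The function name, base URL, and set name are hypothetical, and nothing here is taken from MarcEdit's own code.

```python
from urllib.parse import urlencode


def list_records_url(base_url, set_spec=None, metadata_prefix="oai_dc"):
    """Build a ListRecords request for one set in one metadata format."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec
    return base_url + "?" + urlencode(params)


# Hypothetical repository address and set name, for illustration only.
url = list_records_url("https://repository.example.edu/oai/request",
                       set_spec="theses")
print(url)
```

Changing the metadata type in the harvester corresponds to changing the metadataPrefix argument. The default shown here, oai_dc, is the unqualified DC format that every OAI-PMH repository is required to support.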

Using the advanced settings, users 
have the ability to define a subset of 
records (using Start and End), indi- 
vidual records (using GetRecord) or 
resume harvesting a predefined record 
set (using the ResumptionToken). 
Additionally, the harvester can translate 
record data from Unicode to MARC-8 
as well as simply harvest and save the 
raw XML metadata files to a local 
file system. The character conversion 
options should be of special value for 
libraries that still use systems that can- 
not load or recognize MARC records 
encoded in UTF-8. Functionality has 
been added for users wanting to har- 
vest XML-based metadata and cre- 
ate records using the legacy MARC-8 
character set. Again, users need not set 
any of these options to harvest OAI- 
PMH metadata, but they are available 
for more granular data capture. 
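
Large sets are delivered in batches, and the resumption token is what links one batch to the next. The following sketch shows the general paging loop defined by the OAI-PMH specification. It is not MarcEdit's code: the function names are assumptions, the repository address is hypothetical, and the responses are canned strings so that the example runs without a network connection.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"


def harvest(base_url, fetch, **first_request):
    """Yield every <record> element, following resumption tokens.

    `fetch` is any callable that takes a URL and returns the response body
    as text; passing it in keeps this sketch testable offline.
    """
    params = {"verb": "ListRecords", **first_request}
    while True:
        root = ET.fromstring(fetch(base_url + "?" + urlencode(params)))
        yield from root.iter(OAI_NS + "record")
        token = root.find(f".//{OAI_NS}resumptionToken")
        if token is None or not (token.text or "").strip():
            break  # no token, or an empty one, marks the final batch
        # Follow-up requests carry only the verb and the token.
        params = {"verb": "ListRecords",
                  "resumptionToken": token.text.strip()}


# Canned two-page response standing in for a real repository.
PAGE = ('<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/"><ListRecords>'
        '{records}<resumptionToken>{token}</resumptionToken>'
        '</ListRecords></OAI-PMH>')


def fake_fetch(url):
    if "resumptionToken=page2" in url:
        return PAGE.format(records="<record/>", token="")
    return PAGE.format(records="<record/><record/>", token="page2")


records = list(harvest("https://repository.example.edu/oai", fake_fetch,
                       metadataPrefix="oai_dc", set="theses"))
print(len(records))
```

A date-limited harvest would add the protocol's from and until arguments to the first request. Because `from` is a reserved word in Python, it has to be passed as `**{"from": "2008-01-01"}` rather than as a plain keyword.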

Case Study: OSU Libraries 
Electronic Theses and 
Dissertation Record 
Generation 

Getting Started 

In January 2007, OSU joined a growing 
fraternity of universities whose students 
must submit electronic copies of their 
theses or dissertations in order to grad- 
uate. This policy shift by the graduate 
school was met with great excitement 
by OSU Libraries, which would take 
on the role of preserving and providing 
access for these materials through the 
library's institutional repository (IR) 
portal, ScholarsArchive@OSU. Within 
the IR, these materials could find a 
larger audience both inside and out- 
side the university, potentially extend- 
ing the reach of the research being 
done by the university. 

With these changes came a num- 
ber of challenges for OSU Libraries' 
technical services department. Like 
most institutions, OSU Libraries had 
traditionally created original MARC 
records for OSU theses, adding the 
MARC records to the local ILS as well 
as the WorldCat database. Cataloging 
for these records was done as materials 
were submitted to OSU Libraries by 
the graduate school; technical services 
staff usually received all of a term's the- 
ses at one time. All record creation was 
performed using Connexion, meaning 
records were created once, dynami- 
cally becoming part of WorldCat, and 
then downloaded directly to the local 
library catalog. 

The submissions of the theses 
and dissertations in electronic format, 
however, would be a much differ- 
ent process. First, unlike traditional 
print documents, electronic theses 
and dissertations would be submit- 
ted into the IR at any point during 
the term. Materials would first be 
vetted by the graduate school, then 
released to OSU Libraries, where they 
would be evaluated and then be made 
public. Technical services staff could 
no longer allocate fixed processing 
time for handling theses and disserta- 
tions because materials now would 
not be submitted on a fixed schedule. 
Secondly, metadata creation for these 
documents would shift from technical 
services staff to the document cre- 
ators. When documents are submitted 
into the IR, submitters are required to 
provide metadata including abstracts 
and keywords. As a result, technical
services staff began to use the metada- 
ta stored within the IR as the primary 
bibliographic data of record, mean- 
ing that metadata creation no longer 
took place in Connexion and was no 
longer being generated in MARC. 
This left technical services with two 
options: catalog materials twice (once 
in the IR and once in Connexion) 
or design a process that would allow 
metadata entered into the IR to be 
loaded directly into both WorldCat 
and the local catalog. Ultimately, the 
library chose the second option, not
only to avoid the expense of rekeying 
records but also to support efforts to 
design processes that repurpose meta- 
data rather than rekeying. As a side 
benefit, by using the metadata entered 
into the IR, the library was able to 
take advantage of an entirely new set 
of metadata elements: user-contributed
keywords and descriptions of the
document. For materials like theses 
or dissertations, this information can 
be invaluable given the timeliness of 
topics, many of which are yet to be 
represented well within existing controlled
vocabularies like the LC Subject
Headings (LCSH). 

Submission Process 

The submission process for OSU 
Libraries' ETD program mirrors those 
used by a number of other institutions 
using DSpace as their IR platform. 
Materials are submitted directly into 
the IR by their authors; in this case, 
it is the graduate or PhD candidate. 
Each submitter answers a number 
of questions through the submission 
process, entering information about 
their paper and topic. This metadata 
forms the foundation for the MARC 
records creation later in the process. 
The submitter then chooses a distribu- 
tion license and uploads the document 
to the IR. 

Once the item has been submitted 
to the IR, it is then vetted by the uni- 
versity's graduate school. This process 
ensures that only
approved materials 
are actually archived 
in the IR. Once 
approved, the item is 
then forwarded into 
the IR's work queue, 
which is managed by 
technical services. At 
this point, a library 
staff member does 
original cataloging for 
the item in the IR. He 
or she then validates 
the user-submitted 
metadata and enters 
LCSH subject terms 
and any necessary 
descriptive notes for 
the document. When 
finished, the material 
is published to the IR 
and made available 
to the larger research 
community. 

The question 
of whether MARC records are still 
needed was one with which OSU 
Libraries has struggled for some time. 
While having the records within both 
the local ILS as well as the WorldCat 
database was ideal, the reality was that 
staff simply did not have time to rekey 
data to create the MARC records. 
Moreover, given the increased accessi- 
bility of these documents through Web 
search engines like Google, questions arose
regarding the need to continue pro- 
ducing MARC records. In the end, the 
library decided that having the data in 
WorldCat was important and set out to 
build a workflow that would keep staff 
from having to rekey the metadata 
from the IR into Connexion. 

Automatic MARC Record 
Generation 

MarcEdit offered a solution to the 
rekeying issue. As a DSpace reposi- 
tory, the library's IR could provide 
metadata for new and modified 
records entered into the IR through 
Figure 4. MarcEdit OAI-PMH Metadata Harvester

Figure 5. Advanced Settings Function in MarcEdit Metadata Harvester



OAI-PMH. Unfortunately, the clas- 
sified staff ultimately responsible for 
creating MARC records for IR items 
had no experience working with OAI- 
PMH or repurposing metadata from 
one schema to another. The ideal tool 
would automatically generate records 
from the metadata while essentially 
hiding this interaction from staff. 

To meet this need, a special XSLT 
crosswalk derived from the default 
template was created to translate the 
DC metadata used for ETDs to their 
equivalent MARC fields. 12 This cross- 
walk varied from the vanilla DC-to- 
MARC 21 XML crosswalks provided 
by the LC because it used position- 
ing within the metadata record to 
determine the context of some of the 
record's metadata, since all metadata 
being harvested through OAI-PMH 
was unqualified DC. Using informa- 
tion about the generated metada- 
ta, an XSLT stylesheet was created 
to restore context to the harvested 
metadata. Once the context for these 
metadata elements was reestablished, 
creating good MARC 21
records became much easier.
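
As an illustration of position-based context, the sketch below assigns MARC tags to repeated unqualified DC elements according to where they fall in the record. The rule used here (first creator to 100, later creators to 700) is a hypothetical convention chosen for the example, not the mapping in the OSU Libraries stylesheet, and the sample record is invented. The actual crosswalk is written in XSLT. Python is used here only to show the logic.

```python
import xml.etree.ElementTree as ET

DC = "{http://purl.org/dc/elements/1.1/}"

# Invented sample record in the oai_dc format.
SAMPLE = """<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                       xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title>Stream temperature and salmon habitat</dc:title>
  <dc:creator>Smith, Jane</dc:creator>
  <dc:creator>Doe, John</dc:creator>
</oai_dc:dc>"""


def dc_to_marc_fields(xml_text):
    """Return (tag, value) pairs, using element position to pick the tag."""
    root = ET.fromstring(xml_text)
    fields = []
    for position, creator in enumerate(root.findall(DC + "creator")):
        # Hypothetical rule: first creator is the main entry, the rest
        # become added entries.
        tag = "100" if position == 0 else "700"
        fields.append((tag, creator.text))
    for title in root.findall(DC + "title"):
        fields.append(("245", title.text))
    return sorted(fields)  # tag order, as in a MARC record


for tag, value in dc_to_marc_fields(SAMPLE):
    print(tag, value)
```

A rule of this kind works only when the repository's submission form always emits elements in the same order, which is why such a crosswalk is tied to a particular collection's input practices.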

After being created, this custom 
XSLT transformation was registered 
with the MarcEdit application, allow- 
ing staff to use a simple, wizard-based 
application to harvest OAI-PMH 
records and convert them directly into 
MARC. This process has allowed staff 
to develop a workflow that allows 
them to immediately process items 
when submitted to the IR while doing 
MARC record generation for those 
items at the end of each week. The 
records are reviewed for accuracy and 
then uploaded directly to WorldCat 
and the local ILS. Appendix B pro- 
vides an example of records currently 
generated with the OSU Libraries 
record-generation process. The exam- 
ple demonstrates how Unicode charac- 
ters outside of the MARC-8 character 
set are embedded into the document 
as well as how subjects and headings 
are analyzed to create records that 
will require minimal manual
intervention during the later steps of 
the process. This process has become 
more and more automatic as time has 
progressed, and the XSLT stylesheet 
has been refined to allow for minimal 
review prior to uploading metadata to 
the local catalog and WorldCat. 

Problems and Solutions 

During the early testing phases of this 
process, several potential problems 
needed to be resolved. The first and 
most important issue was related to 
character encodings. All data loaded 
into the IR and harvested through OAI- 
PMH was encoded in UTF-8. While that 
is certainly desirable, OSU Libraries' 
local ILS was not set up to recog- 
nize Unicode data in MARC records. 
Likewise, OCLC's own importing tools 
do not presently allow for the 
import of Unicode data within MARC 
records. This meant that whatever 
harvesting tool OSU Libraries used to 
generate MARC records had to support 
some form of character set remapping 
between Unicode and 
MARC-8. Fortunately, MarcEdit's 
OAI Harvester includes the ability 
to remap metadata from Unicode to 
MARC-8 on the fly, quickly solving 
this issue for the library. 
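
The remapping problem can be illustrated with a small sketch. It is not MarcEdit's algorithm, which is not described here; a real Unicode-to-MARC-8 conversion relies on character mapping tables. The sketch shows only the fallback step, under the simplifying assumption that ASCII alone is safe: any other character is written as a hexadecimal numeric character reference so that nothing is lost when the target system cannot accept Unicode. The function name is hypothetical.

```python
def ncr_fallback(text: str) -> str:
    """Replace every non-ASCII character with an &#xXXXX; reference.

    Simplifying assumption: only ASCII is treated as safe. A real MARC-8
    converter would first try its mapping tables and fall back to a
    reference only for characters that have no MARC-8 equivalent.
    """
    out = []
    for ch in text:
        if ord(ch) < 0x80:
            out.append(ch)
        else:
            out.append("&#x{:04X};".format(ord(ch)))
    return "".join(out)


sample = "A single hot channel model\u2019s results"
print(ncr_fallback(sample))
# A single hot channel model&#x2019;s results
```

Because the reference carries the code point, the original character can be restored later if the receiving system gains Unicode support.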

The harvesting of granular meta- 
data using unqualified DC remains an 
issue today. While much of the context 
lost because of the generic nature 
of unqualified DC can be reclaimed 
through careful analysis of the metada- 
ta, one ambiguity that cannot be easily 
resolved is the differentiation between 
staff-submitted LCSH subject terms 
and user-submitted keywords. While 
the XSLT stylesheet can be coded to 
make a very good educated guess as 
to the nature of these elements, the 
ambiguity persists enough that review 
is required following record genera- 
tion. Fortunately, a solution to this 
issue has been created by the DSpace 
community. The recently released 
DSpace 1.5 simplifies the inclusion of 
additional supported metadata formats 
for harvest through OAI-PMH. This 
should allow OSU Libraries to modify 
the XSLT transformation so that it 
uses a more granular XML schema 
for harvesting, such as qualified DC 
or MODS, and so that it retains the 
context associated with each harvested 
element, producing production-ready 
MARC records from the harvest. 
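
A harvester can find out which formats a repository offers by issuing the OAI-PMH ListMetadataFormats verb and then choosing the most granular one. The following is a minimal sketch of that selection step. The preference order and the sample response are assumptions made for illustration: only oai_dc is guaranteed by the protocol, and the prefixes a given DSpace installation exposes must be checked against its actual response.

```python
import xml.etree.ElementTree as ET

OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}

# Hypothetical preference order, most granular first.
PREFERRED = ["marc21", "mods", "qdc", "oai_dc"]


def pick_prefix(list_formats_xml: str, preferred=PREFERRED) -> str:
    """Return the first preferred metadataPrefix that the server offers."""
    root = ET.fromstring(list_formats_xml)
    offered = {
        el.text.strip()
        for el in root.iterfind(".//oai:metadataPrefix", OAI_NS)
        if el.text
    }
    for prefix in preferred:
        if prefix in offered:
            return prefix
    raise ValueError("no usable metadata format offered")


# Invented sample response, trimmed to the elements the function reads.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListMetadataFormats>
    <metadataFormat><metadataPrefix>oai_dc</metadataPrefix></metadataFormat>
    <metadataFormat><metadataPrefix>mods</metadataPrefix></metadataFormat>
  </ListMetadataFormats>
</OAI-PMH>"""

print(pick_prefix(SAMPLE))
# mods
```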

Use Case: Automatic MARC 
21 Record Generation for 
Remote Resources 

Once established, the ability to harvest 
and repurpose metadata can funda- 
mentally change how librarians collect 
materials. By lowering the barriers 
for creating MARC or, alternatively, 
other forms of records for addition 
to an institution's discovery application, 
technical services departments 
can empower collection development 
staff to evaluate a wider range of the 
electronically available materials being 
produced by research institutions. 



Likewise, technical services depart- 
ments can represent documents within 
their local ILS that would have previ- 
ously been considered out of reach. 
Examples of this at OSU Libraries are 
numerous, ranging from the capture 
of documents from a sister institu- 
tion's IR to automated harvesting and 
record generation of tens of thousands 
of digital documents stored within 
CONTENTdm. Outside of OSU 
Libraries, libraries are starting to con- 
sider how they can leverage OAI-PMH 
metadata made available from vendors 
like NewsBank or how they can cap- 
ture and represent tens of thousands 
of records from free metadata repositories 
like Project Gutenberg 
(www.gutenberg.org) or the LC's American 
Memory Project 
(http://memory.loc.gov/ammem). 

One specific remote metadata 
set currently of interest to a growing 
number of institutions is UM Library's 
digital collections, or, more specifically, 
metadata from its HathiTrust 
Digital Library (formerly MBooks) 
project (www.lib.umich.edu/mdp) for 
materials currently being scanned 
through Google's Book Scanning proj- 
ect. In December 2007, UM Library 
announced that it would be making 
available OAI-PMH harvestable meta- 
data records for all the public domain 
materials captured through the proj- 
ect. 13 For the library community, this 
decision was significant because it was 
the first to come from any of the 
institutions partnering with Google. 
Not surprisingly, many libraries have 
started looking at how this content can 
be captured and loaded into their ILS 
systems. 

<?xml version="1.0" encoding="UTF-8"?> 
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/" 
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
         xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/ http://www.openarchives.org/OAI/2.0/OAI-PMH.xsd"> 
  <responseDate>2008-12-23T13:10:31Z</responseDate> 
  <request verb="ListSets">http://quod.lib.umich.edu/cgi/o/oai/oai</request> 
  <ListSets> 
    <set> 
      <setSpec>hathitrust</setSpec> 
      <setName>HathiTrust digital repository</setName> 
      <setDescription>The "hathitrust" sets in this repository contain records from the HathiTrust digital repository, 
        formerly MBooks. The HathiTrust digital repository is the access system to the digitized collections of some of the 
        nation's great research libraries. For more on the HathiTrust, view the website at http://www.hathitrust.org/.</setDescription> 
    </set> 
    <set> 
      <setSpec>hathitrust:pd</setSpec> 
      <setName>Public domain items worldwide</setName> 
    </set> 
    <set> 
      <setSpec>hathitrust:pdus</setSpec> 
      <setName>Public domain items according to copyright law in the United States</setName> 
    </set> 
    <set> 
      <setSpec>dlps</setSpec> 
      <setName>Digital Library Production Service (DLPS) digital objects</setName> 
    </set> 
    <set> 
      <setSpec>dlpstext</setSpec> 
      <setName>Digital Library Production Service (DLPS) text collections</setName> 
    </set> 
    <set> 
      <setSpec>dlps:alajournals</setSpec> 
      <setName>Abraham Lincoln Association Journals</setName> 
    </set> 

Figure 6. ListSets Response from the University of Michigan OAI-PMH Server 

While many libraries likely would 
have preferred that UM Library simply 
provide large downloadable metadata 
sets in MARC 21 format, it has 
essentially done this by making 
the metadata available for harvesting. 
While the OAI-PMH protocol requires 
that metadata be provided at least in 
unqualified DC, it also lets metadata 
providers make additional schemas 
available for harvest. UM Library has 
chosen to provide its metadata records 
both in DC and MARC 21 XML. Since 
MARC 21 XML records can be translated 
directly to MARC 21, one only needs to 
decide to harvest the metadata. 
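
MARC 21 XML mirrors MARC 21 field for field, which is why the translation is mechanical. As a rough illustration, the sketch below renders a MARC 21 XML record as mnemonic text lines in the style of Appendix B, with a backslash standing in for a blank. It is an assumption-laden sketch and not MarcEdit's converter: it writes text rather than binary MARC 21, and the function name and sample record are invented for the example.

```python
import xml.etree.ElementTree as ET

MARC_NS = {"m": "http://www.loc.gov/MARC21/slim"}


def marcxml_to_mnemonic(record_xml: str) -> list:
    """Render one MARC 21 XML <record> as mnemonic text lines."""
    rec = ET.fromstring(record_xml)
    lines = []
    leader = rec.find("m:leader", MARC_NS)
    if leader is not None:
        lines.append("=LDR  " + (leader.text or ""))
    for cf in rec.findall("m:controlfield", MARC_NS):
        # Blanks in control fields are shown as backslashes.
        lines.append("={}  {}".format(cf.get("tag"), (cf.text or "").replace(" ", "\\")))
    for df in rec.findall("m:datafield", MARC_NS):
        ind = (df.get("ind1", " ") + df.get("ind2", " ")).replace(" ", "\\")
        subs = "".join(
            "$" + sf.get("code") + (sf.text or "")
            for sf in df.findall("m:subfield", MARC_NS)
        )
        lines.append("={}  {}{}".format(df.get("tag"), ind, subs))
    return lines


# Invented sample built from values shown in Appendix B.
SAMPLE = """<record xmlns="http://www.loc.gov/MARC21/slim">
  <leader>02358ntm  22003371a 4500</leader>
  <datafield tag="100" ind1="1" ind2=" ">
    <subfield code="a">Marcum, Wade R.</subfield>
  </datafield>
  <datafield tag="856" ind1="4" ind2="1">
    <subfield code="u">http://hdl.handle.net/1957/8272</subfield>
  </datafield>
</record>"""

for line in marcxml_to_mnemonic(SAMPLE):
    print(line)
```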

Using MarcEdit as an OAI har- 
vester, the process is relatively simple. 
As noted above, OAI-PMH harvest- 
ing requires the definition of a base 
URL and the identification of the set 
to be harvested. UM Library is cur- 
rently making metadata for numerous 
digital collections available through 
its OAI-PMH service, though it has 
separated its collection into three sets. 



These sets can be quickly identified 
by making a direct query to the OAI- 
PMH service, requesting the names 
and information needed to harvest the 
collection. In this case, the OAI-PMH 
command that would be used is the 
ListSets command. The ListSets com- 
mand will return connection informa- 
tion about the collections being hosted 
on the server; figure 6 shows a sample 
response to the ListSets command. 
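
For readers who want to see the request itself, the following sketch builds a ListSets URL and extracts each set's identifier and name from a response. The base URL is the one shown in the request element of figure 6; the sample response is a trimmed copy of that figure, and the function names are hypothetical. No network call is made.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def list_sets_url(base_url: str) -> str:
    """Build the ListSets request for an OAI-PMH base URL."""
    return base_url + "?" + urlencode({"verb": "ListSets"})


def parse_sets(response_xml: str) -> list:
    """Return (setSpec, setName) pairs from a ListSets response."""
    root = ET.fromstring(response_xml)
    sets = []
    for s in root.iterfind(".//oai:set", OAI_NS):
        spec = s.findtext("oai:setSpec", default="", namespaces=OAI_NS).strip()
        name = s.findtext("oai:setName", default="", namespaces=OAI_NS).strip()
        sets.append((spec, name))
    return sets


# Trimmed from figure 6 and closed so that it parses.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListSets>
    <set>
      <setSpec>hathitrust:pd</setSpec>
      <setName>Public domain items worldwide</setName>
    </set>
    <set>
      <setSpec>hathitrust:pdus</setSpec>
      <setName> Public domain items according to copyright law in the United States </setName>
    </set>
  </ListSets>
</OAI-PMH>"""

print(list_sets_url("http://quod.lib.umich.edu/cgi/o/oai/oai"))
for spec, name in parse_sets(SAMPLE):
    print(spec, "|", name)
```

The setSpec value is what the harvester needs; the setName is only a label for the person choosing a set.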

The next step is to configure 
MarcEdit to harvest the metadata. 
Figure 7 shows how to set the 
definitions in MarcEdit's OAI-PMH 
Harvester. Using the configuration presented 
in figure 7, the MarcEdit OAI 
Harvester would capture all metadata 
records from the mbooks:pdus set, 
translate items from MARC 21 XML 
to MARC 21, and convert all UTF-8 
data to MARC-8. 
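
Behind a configuration like this sits a single OAI-PMH ListRecords request. The sketch below shows how such a request might be assembled. The metadata prefix "marc21" and the set name (taken from figure 6) are assumptions; both should be confirmed against the server's ListMetadataFormats and ListSets responses. The function name is hypothetical, and the sketch does not describe how MarcEdit builds its requests.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix=None, set_spec=None,
                     resumption_token=None):
    """Build a ListRecords request.

    OAI-PMH treats resumptionToken as an exclusive argument, so when a
    token is supplied the prefix and set are left out of the request.
    """
    if resumption_token:
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        if not metadata_prefix:
            raise ValueError("metadata_prefix is required for the first request")
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
        if set_spec:
            params["set"] = set_spec
    return base_url + "?" + urlencode(params)


BASE = "http://quod.lib.umich.edu/cgi/o/oai/oai"
print(list_records_url(BASE, metadata_prefix="marc21", set_spec="hathitrust:pdus"))
print(list_records_url(BASE, resumption_token="abc"))
```

The colon in the set name is percent-encoded as %3A by urlencode, which is the form a server expects in a query string.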

Several problems commonly occur 
during the harvesting process. The 
most frequent is server nonresponsive- 
ness. During numerous test harvests of 
the UM Library collection, this author 
found that its OAI-PMH server 
would often drop harvesting requests 
after processing approximately one 
hundred thousand items. Most OAI- 
PMH harvesters provide some sup- 
port for recovering from these types 
of failures, providing the information 
needed to resume the data harvest at 
the point where the connection was lost. 
MarcEdit uses a two-stage approach to 
recover from this error and resolve 
problems for users when possible. In 
the case of server nonresponsiveness, 
MarcEdit uses a tapered approach to 
data harvesting and issues multiple 
data requests at differing intervals to 
adjust for server nonresponsiveness. 
Additionally, if harvesting is stopped 
for any reason, the user can resume 
harvesting metadata incrementally 
using the last resumption token pro- 
cessed by the software. This way, if the 
OAI-PMH server drops the harvest- 
ing connection, the request can be 
restarted where it stopped. In addition 
to dropped connections, other issues 
that may be present are errors within 
the metadata themselves (encoding 
errors) or the MARC-encoded data. 
When these records were first made 
available, a number of records within 
the harvested metadata set included 
invalid MARC data. Unfortunately, 
this represents a frequent problem 
with harvestable metadata. MarcEdit's 
OAI Harvester helps by flagging these 
issues and by correcting records whose 
errors are structural. For its part, UM 
Library quickly fixed these errors when 
reported, but invalid data elements are 
always an issue when dealing with 
metadata from remote sites. 
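
The recovery behavior described above, retrying at widening intervals and resuming from the last resumption token, can be sketched in a few lines. This is an illustration of the general pattern only; MarcEdit's internal logic is not documented here, and the doubling delay is an assumption standing in for its "tapered" approach. The fetch function is supplied by the caller, so the sketch runs without a network connection.

```python
import time
import xml.etree.ElementTree as ET

OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def harvest(fetch, max_retries=3, base_delay=1.0, sleep=time.sleep, token=None):
    """Collect ListRecords pages until the server stops issuing tokens.

    fetch(token) is a caller-supplied (hypothetical) function that returns
    one response as a string and raises OSError when the connection fails.
    Returns (pages, token, complete). When complete is False, token is the
    value to pass back in to resume the harvest.
    """
    pages = []
    while True:
        for attempt in range(max_retries + 1):
            try:
                xml_text = fetch(token)
                break
            except OSError:
                if attempt == max_retries:
                    return pages, token, False
                sleep(base_delay * 2 ** attempt)  # wait longer after each failure
        pages.append(xml_text)
        el = ET.fromstring(xml_text).find(".//oai:resumptionToken", OAI_NS)
        token = el.text.strip() if el is not None and el.text else None
        if token is None:
            return pages, None, True


# Invented two-page harvest whose second request fails once.
PAGE = ('<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">'
        '<ListRecords>{}</ListRecords></OAI-PMH>')
PAGE1 = PAGE.format("<resumptionToken>abc</resumptionToken>")
PAGE2 = PAGE.format("<resumptionToken/>")

calls = []


def flaky_fetch(token):
    calls.append(token)
    if len(calls) == 2:
        raise OSError("connection dropped")
    return PAGE1 if token is None else PAGE2


pages, last_token, complete = harvest(flaky_fetch, sleep=lambda seconds: None)
print(len(pages), last_token, complete)
# 2 None True
```

Returning the failing token rather than raising an error lets the caller store it and restart the harvest later from the same point.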

Conclusion 

As more institutions bring digital col- 
lections online, technical services 
staff will continue to face the growing 
issue of distributed metadata retriev- 
al. Unlike their print cousins, today's 
institutional repositories and digital 
collections give rise to metadata of a 
distributed nature that requires technical 
services departments to think 
creatively and produce workflows that 
encourage repurposing data. Tools 
like MarcEdit's OAI-PMH Harvester 
simplify that process for staff by 
allowing nontechnical users to harvest 
metadata in various formats without 
dealing with issues relating to XML 
validation or character encodings. 
For too long, technical services staff 
has viewed metadata harvesting and 
transformation as a job for library 
technology departments. As new tools 
and workflows continue to be developed, 
more technical services departments 
will likely turn to metadata harvesting 
and capture as a viable method of 
generating metadata for digital 
collections. 

Appendix A. OAI MARC to MARC 21 XML Conversion 

<?xml version="1.0" encoding="UTF-8"?> 
<xsl:stylesheet xmlns="http://www.loc.gov/MARC21/slim" 
    xmlns:oai="http://www.openarchives.org/OAI/1.1/oai_marc" 
    version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" 
    exclude-result-prefixes="oai"> 

  <!-- Wrap all converted records in a single MARC 21 XML collection --> 
  <xsl:template match="/"> 
    <collection xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
        xsi:schemaLocation="http://www.loc.gov/MARC21/slim http://www.loc.gov/standards/marcxml/schema/MARC21slim.xsd"> 
      <xsl:apply-templates/> 
    </collection> 
  </xsl:template> 

  <xsl:template name="OAI-PMH"> 
    <xsl:for-each select="ListRecords/record/metadata/oai:oai_marc"> 
      <xsl:apply-templates/> 
    </xsl:for-each> 
    <xsl:for-each select="GetRecord/record/metadata/oai:oai_marc"> 
      <xsl:apply-templates/> 
    </xsl:for-each> 
  </xsl:template> 

  <!-- Suppress text outside the elements handled below --> 
  <xsl:template match="text()"/> 

  <xsl:template match="oai:oai_marc"> 
    <record xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
        xsi:schemaLocation="http://www.loc.gov/MARC21/slim http://www.loc.gov/standards/marcxml/schema/MARC21slim.xsd"> 
      <!-- Assemble the 24-character leader; blank positions are padded with spaces --> 
      <leader> 
        <xsl:text>     </xsl:text> 
        <xsl:value-of select="@status"/> 
        <xsl:value-of select="@type"/> 
        <xsl:value-of select="@level"/> 
        <xsl:text>  22     </xsl:text> 
        <xsl:value-of select="@encLvl"/> 
        <xsl:value-of select="@catForm"/> 
        <xsl:text> 4500</xsl:text> 
      </leader> 
      <xsl:apply-templates select="oai:fixfield|oai:varfield"/> 
    </record> 
  </xsl:template> 

  <!-- Fixed fields become control fields; the first and last characters of the value are dropped --> 
  <xsl:template match="oai:fixfield"> 
    <xsl:element name="controlfield"> 
      <xsl:call-template name="id2tag"/> 
      <xsl:value-of select="substring(text(),2,string-length(text())-2)"/> 
    </xsl:element> 
  </xsl:template> 

  <xsl:template match="oai:varfield"> 
    <xsl:element name="datafield"> 
      <xsl:call-template name="id2tag"/> 
      <xsl:attribute name="ind1"> 
        <xsl:call-template name="idBlankSpace"> 
          <xsl:with-param name="value" select="@i1"/> 
        </xsl:call-template> 
      </xsl:attribute> 
      <xsl:attribute name="ind2"> 
        <xsl:call-template name="idBlankSpace"> 
          <xsl:with-param name="value" select="@i2"/> 
        </xsl:call-template> 
      </xsl:attribute> 
      <xsl:apply-templates select="oai:subfield"/> 
    </xsl:element> 
  </xsl:template> 

  <xsl:template match="oai:subfield"> 
    <xsl:element name="subfield"> 
      <xsl:attribute name="code"> 
        <xsl:value-of select="@label"/> 
      </xsl:attribute> 
      <xsl:value-of select="text()"/> 
    </xsl:element> 
  </xsl:template> 

  <!-- Left-pad the field id with zeros to form a three-character tag --> 
  <xsl:template name="id2tag"> 
    <xsl:attribute name="tag"> 
      <xsl:variable name="tag" select="@id"/> 
      <xsl:choose> 
        <xsl:when test="string-length($tag)=1"> 
          <xsl:text>00</xsl:text> 
          <xsl:value-of select="$tag"/> 
        </xsl:when> 
        <xsl:when test="string-length($tag)=2"> 
          <xsl:text>0</xsl:text> 
          <xsl:value-of select="$tag"/> 
        </xsl:when> 
        <xsl:when test="string-length($tag)=3"> 
          <xsl:value-of select="$tag"/> 
        </xsl:when> 
      </xsl:choose> 
    </xsl:attribute> 
  </xsl:template> 

  <!-- Emit a single space when an indicator is empty --> 
  <xsl:template name="idBlankSpace"> 
    <xsl:param name="value"/> 
    <xsl:choose> 
      <xsl:when test="string-length($value)=0"> 
        <xsl:text> </xsl:text> 
      </xsl:when> 
      <xsl:otherwise> 
        <xsl:value-of select="$value"/> 
      </xsl:otherwise> 
    </xsl:choose> 
  </xsl:template> 
</xsl:stylesheet> 
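
For readers less familiar with XSLT, the small helper rules in the stylesheet can be restated in Python. This is a sketch for explanation and not a replacement for the transformation. The function names are hypothetical, the leader spacing assumes the standard 24-character MARC 21 leader layout, and fixed-field values are assumed to carry one delimiter character on each side because the stylesheet drops the first and last character.

```python
def id2tag(field_id: str) -> str:
    """Left-pad a 1 to 3 character field id with zeros, as id2tag does."""
    if 1 <= len(field_id) <= 3:
        return field_id.zfill(3)
    return ""  # the stylesheet's xsl:choose emits nothing for other lengths


def id_blank_space(value: str) -> str:
    """Return a single space for an empty indicator, as idBlankSpace does."""
    return value if value else " "


def fixfield_value(text: str) -> str:
    """Drop the first and last character, like substring(text(), 2, length - 2)."""
    return text[1:-1]


def build_leader(status, rec_type, level, enc_lvl, cat_form):
    """Assemble a 24-character leader with blanks in the unset positions."""
    return "     " + status + rec_type + level + "  22     " + enc_lvl + cat_form + " 4500"


print(id2tag("8"))               # 008
print(repr(id_blank_space("")))  # ' '
leader = build_leader("n", "t", "m", "1", "a")
print(repr(leader), len(leader))
```

The record length and base address positions are left blank here, as in the stylesheet; software that writes binary MARC 21 normally computes them when the record is serialized.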



Appendix B. Sample Generated Record 

=LDR  02358ntm  22003371a 4500 
=008  051012s2008\\\\xx\a\\\\\bm\\\000\0\eng\d 
=040  \\$aORE$cORE 
=049  \\$aOREV 
=090  \\$aLD4330 2008$bMarcum, Wade R. 
=100  1\$aMarcum, Wade R. 
=245  10$aThermal hydraulic analysis of the Oregon State TRIGA Reactor using RELAP5-3D /$cby Wade R. Marcum. 
=260  \\$cc2008. 
=300  \\$axx leaves : $bill. ; $c29 cm. 
=500  \\$aPrintout. 
=502  \\$aThesis (M.S.)--Oregon State University, 2008. 
=520  \\$aOregon State University has recently conducted a complete core conversion analysis as part of the Reduced 
Enrichment for Research and Test Reactors Program. The goals of the thermal hydraulic analyses were to calculate natural 
circulation flow rates, coolant temperatures and fuel temperatures as a function of core power for both the Highly Enriched 
Uranium (HEU) and Low Enriched Uranium (LEU) cores; for steady state and pulsed operation, calculate peak values of 
fuel temperature, cladding temperature, surface heat flux as well as critical heat flux ratio (CHFR) and temperature profiles 
in hot channel for both the HEU and LEU cores; finally, perform accident analyses for the accident scenarios identified in 
the Oregon State TRIGA(reg) Reactor (OSTR) Safety Analysis Report (SAR). RELAP5-3D Version 2.4.2 was used for all 
computational modeling during the thermal hydraulics analysis. This is a lumped parameter code forcing engineering 
assumptions to be made during the analysis. A single hot channel model’s results are compared to that produced from more 
refined two and eight channel models in order to identify variations in thermal hydraulic characteristics as a function of spatial 
refinement. 
=530  \\$aAlso available on the World Wide Web. 
=583  \\$xIssue Date: 2008-04-03T23:16:26Z 
=504  \\$aIncludes bibliographical references (leaves - ). 
=650  \0$aTRIGA reactors. 
=650  \0$aTRIGA reactors $xSafety measures. 
=650  \0$aNuclear reactors $xFluid dynamics. 
=650  \0$aHeat flux. 
=650  \0$aRELAP5-3D. 
=690  \\$aTheses, OSU$xNuclear Engineering. 
=856  41$uhttp://hdl.handle.net/1957/8272 


Metadata. By Marcia Lei Zeng and 
Jian Qin. New York: Neal-Schuman, 
2008. 365p. $65.00 softbound (ISBN 
978-1-55570-635-7/1-55570-635-5). 

Professors Marcia Lei Zeng and 
Jian Qin each have several years of 
experience as professional educators 
and researchers in the fields of knowl- 
edge organization and metadata (365). 
One fruit of their collective labors 
is the recently published Metadata, 
a comprehensive assessment of the 
theory and practice of metadata design 
and implementation. The authors have 
endeavored to take a "unique, enlight- 
ening, and holistic approach" (xvii) to 
the subject of metadata by fashioning 
a dual-purpose manual: a textbook for 
students and an "instructional guide" 
(xv) for metadata specialists. 

Metadata is divided into four 
parts. The first is dedicated to basic 
principles and definitions and includes 
chapter 2, "Current Standards," a sur- 
vey of the semantics and structures of 
various metadata schemes and content 
standards, many common in library 
and related institutions (e.g., Machine- 
Readable Cataloging (MARC), 
Encoded Archival Description (EAD), 
and Content Standard for Digital 
Geospatial Metadata (CSDGM)), 
others less popular (e.g., vCard and 
Sharable Content Object Reference 
Model (SCORM)). The Text Encoding 
Initiative (TEI) scheme, notably the 
metadata-centric header, is mentioned 
little in this chapter, with occasional 
references elsewhere in the volume. 
Notwithstanding the authors' state- 
ment that no partiality is intended 
with a scheme's exclusion (xvi), con- 
sidering the popularity of TEI with 
digitized text and the historical role its 
header played as a model for EAD's 
own header, I still find this omission 
surprising. 



Part 2 moves away from definitions 
to concentrate on constructs. Chapter 
3, "Schemas — Structure and Se- 
mantics," offers an overview of element 
sets, application profiles, crosswalks, 
and best practices documentation. The 
following chapter, "Schemas — Syntax," 
discusses the encoding of metada- 
ta for computer manipulation while 
positing the advantages of Extensible 
Markup Language (XML). Chapter 5, 
"Metadata Records," brings together 
principles from the previous chap- 
ters to lay the conceptual groundwork 
for — and describe the encoding of — 
standards-based metadata records. 

Metadata registries, repositories, 
and metadata sharing are the focus of 
"Metadata Services," the first chapter 
of part 3. Zeng and Qin continue their 
discussion of interoperability in chap- 
ter 8, "Achieving Interoperability," 
concentrating here on issues regard- 
ing metadata content and schemas 
that are distributed within or among 
metadata repositories. Between these 
two chapters lies "Metadata Quality 
Measurement and Improvement" 
(chapter 7), the content of which is 
self-evident from the title. I make 
special note of the sections therein 
devoted to (1) a description of the 
varying methods of analysis as a means 
to achieving quality metadata, and 
(2) the practical and, in my reading, 
implied ethical ramifications of poor 
metadata. 

Part 4 consists of a single chapter 
devoted to assorted topics on meta- 
data research, such as investigations 
on semantics and conceptual metadata 
modeling. 

A pair of appendixes caps off the 
work. The first is an annotated list of 
various metadata schemes; the second 
is a short compendium of sources 
for controlled vocabularies, content 
standards, and best practices guides. 
These are followed by a well-rounded 
glossary, a rich bibliography (with many 
citations pointing to online resources), 
and a fine index. 

Furthermore, Zeng and Qin offer 
a variety of aids throughout the work 
to assist the reader in learning spe- 
cific concepts and practices. An array 
of helpful illustrations emphasizes or 
further delineates their points, typi- 
cally taking the form of screenshots 
of applications, diagrams illustrating 
various principles, and tables of data. 
Each chapter ends with a selection of 
exercises for the student, all of which 
are also found on a companion website 
(www.metadataetc.org/book-website). 
The site likewise presents links to 
online information resources on meta- 
data (duplicating the bibliographical 
references found at the end of each 
chapter in the book), quizzes, and 
further exercises. (These supplemen- 
tary online materials are also available 
from the publisher on CD-ROM. I 
did not receive a copy of the disc for 
review.) The website in turn points to a 
wiki (www.metadataetc.org/wiki/index.php5?title=Main_Page) 
that supplies 
information on updates to the com- 
panion website, offers an online ver- 
sion of the book's glossary, and devotes 
a section to listing errata found in the 
textbook. 

Metadata covers much ground, as 
the brief outline above demonstrates. 
Considering the scope of the work, 
I posit the question, Did Zeng and 
Qin succeed in their mission to create 
a suitable textbook for the metadata 
student? The answer is yes, with some 
qualifications. 

First, a hazard of employing nar- 
rative form in textbooks is the some- 
times brief, early presentation of a 
concept not solidly defined until later 
in the text. This pitfall appears several 
times in Metadata. For example, the 
authors report on application profiles 
for education communities (49-50) 
well before they give a thorough 
account of the meaning and uses of 
application profiles (112-19). Often 
the reader can resolve any confusion 
with a consultation of the glossary. 
Many of the important terms the writ- 
ers bring to bear in the text are defined 
here; definitions of lesser substance 
they confine only to the text body 
itself (e.g., the data encoding terms 
wrapper and compound elements and 
the concept of ontological modeling). 
When encountering such terms, the 
student must employ the index, thus 
establishing an inconvenient route to 
discovering the meaning of various 
unfamiliar expressions. 

Next, Zeng and Qin acknowledge 
that their book does not provide a 
primer for encoding data for computer 
use (134). Metadata therefore requires 
some prerequisite knowledge of XML 
and Extensible Hypertext Markup 
Language (XHTML) to understand 
the encoding examples. This is one 
reason I caution against the neophyte, 
especially one lacking a technical back- 
ground, using this text for self-study. 
For them, the writing style may be 
dense and at times opaque; for exam- 
ple, one section I find daunting — and 
the newcomer most likely also will — is 
the short but thickly technical expla- 
nation of the MPEG-7 standard, with 
which I was not previously familiar 
(77-79). Students here would benefit 
from the assistance of a mentor or 
practiced work colleague or consulta- 
tion of other resources to understand 
the material put forward in such a 
concentrated manner. 

These minor criticisms aside, 
Metadata and its companion web- 
sites offer an excellent foundation for 
advanced class instruction; the guiding 
hand of a knowledgeable classroom 
instructor is required to traverse some 
of the more difficult subject matter. 
Moreover, the writers explicitly state 
in the preface that "the text is not a 
step-by-step manual for creating meta- 
data records" (xv-xvi). Most metadata 
course syllabi with which I am familiar 
call for extensive engagement with 
specific metadata content standards 
and records. Thus an instructor using 
this textbook in such a circumstance 
may find it challenging to provide 
in a single semester coursework on 
these standards in tandem with a full 
account of the readings offered in 
Metadata. An option to consider is a 
separate course for the practical appli- 
cation of metadata, which is alluded to 
by the authors (16). 

The other audience to whom 
Zeng and Qin direct their work, meta- 
data practitioners with appropriate 
background experience, will find this 
book a very good source for refer- 
ence, review, and for expanding their 
knowledge on facets that lie outside 
their respective fields of expertise. — 
Mark K. Ehlert (ehler043@umn.edu), 
MINITEX Library Information 
Network, Minneapolis. 